xfs
[Top] [All Lists]

2.6.21.3 Oops (was Re: XFS internal error xfs_da_do_buf(2) at line 2087

To: "Marco Berizzi" <pupilla@xxxxxxxxxxx>
Subject: 2.6.21.3 Oops (was Re: XFS internal error xfs_da_do_buf(2) at line 2087 of file fs/xfs/xfs_da_btree.c. Caller 0xc01b00bd)
From: "Satyam Sharma" <satyam.sharma@xxxxxxxxx>
Date: Mon, 11 Jun 2007 19:19:26 +0530
Cc: linux-kernel@xxxxxxxxxxxxxxx, "David Chinner" <dgc@xxxxxxx>, xfs@xxxxxxxxxxx, "Andrew Morton" <akpm@xxxxxxxxxxxxxxxxxxxx>, "Christoph Lameter" <clameter@xxxxxxx>
Dkim-signature: a=rsa-sha1; c=relaxed/relaxed; d=gmail.com; s=beta; h=domainkey-signature:received:received:message-id:date:from:to:subject:cc:mime-version:content-type; b=bDXgf3nwxrvoZQ+A0OVNX+KUvLa9No/mQrVv5fjghOIRe5pJyqYhICL7gOhhv/RfOxUTavsrc1A8LxJRg+ZD/VaRrk8CW9EYpLR4O8CPDnmgpdHdBzYk0CdpM7EdGcmdFsB6uXb2+fBlSJlTeLXWaRvIae7LK9oJdzJ9Y/1sZvo=
Domainkey-signature: a=rsa-sha1; c=nofws; d=gmail.com; s=beta; h=received:message-id:date:from:to:subject:cc:mime-version:content-type; b=BCclVjOWg779i9ikUGdBiqhsmHAV6/lPrURCAhLXkUo22EVeUImBIffgNqBPYXdZCaLzpNcnW8kdWfJK6sx1ZMQwdzHFOvJ9wtYtkX3t/31EMEf8x6nRhdK86n08PRzZyx2V903qmrhA8/AsvVCSnm5cYQk+DQnZHYidkt4aJNs=
Sender: xfs-bounce@xxxxxxxxxxx
Hi Marco,

[ Re-adding David, XFS, Andrew and Christoph; this appears to be
some SLAB / fs (?) issue, so I'm a little out of my depth here :-) ]

> On 6/8/07, Marco Berizzi <pupilla@xxxxxxxxxxx> wrote:
>> After few hours linux has crashed with this message:
>> BUG: at arch/i386/kernel/smp.c:546 smp_call_function()

Well, _this_ particular "bug" (due to a WARN_ON(irqs_disabled) that should be avoided when we're panicing) is resolved in the latest 22-rc4 / -git kernel. However, interestingly, this is not the problem that crashed your system in the first place. Your box had *already* paniced, due to unknown reasons, and _then_ hit the aforementioned WARN_ON.

> Which kernel (exactly) was this

2.6.21.3

Ok, so apparently what happened here was this:

Some RCU callback (that calls kmem_cache_free()) oopsed and
panic'ed his box. [ Marco had experienced fs issues lately, so we could
suspect file_free_rcu() here, but I can't really tell from the stack trace;
BTW whats with the rampant disease in the kernel to declare as inline
even those functions exclusively meant to be dereferenced and passed
as pointers to call_rcu()?! ]

Sadly, 21.3 (21.4 too, actually) had a busticated smp_send_stop()
that would always go WARN_ON when called by panic() as mentioned
above, which meant that the original dmesg stuff outputted by the oops +
panic got scrolled up and all that we had on the screen was the stack trace
for the WARN_ON when you snapped the pic -- the system didn't write to
syslog messages in time and so the extract below isn't quite useful :-(

>  and does this occur
> reproducibly?

I don't know. I try to explain. With all debugging options
enabled 2.6.21.x has never crashed. After two days 2.6.21.3
was running without any debug options, it has crashed.
Tomorrow morning I will start that linux box with linux 2.6.21.3
without any debug options, and I will keep you informed
(friday evening I have switched back to 2.6.21.3 with debug
options enabled, so the machine doesn't crash during the week
end: this system is my company firewall.)

I hope you're able to reproduce this with various debug options enabled (and/or also try the latest 22-rc4 or -git kernel). Could you please send the .config that crashed too?

Anyway, I'd have to leave this upto the others Cc:'ed here. Doesn't look
like a known / resolved issue, though.

> Also, could you please send the dmesg,

Jun 4 20:53:05 Pleiadi kernel: sanitize start
Jun 4 20:53:05 Pleiadi kernel: sanitize end
Jun 4 20:53:05 Pleiadi kernel: copy_e820_map() start: 0000000000000000
size: 000000000009ac00 end: 000000000009ac00 type: 1
Jun 4 20:53:05 Pleiadi kernel: copy_e820_map() type is E820_RAM
Jun 4 20:53:05 Pleiadi kernel: copy_e820_map() start: 000000000009ac00
size: 0000000000005400 end: 00000000000a0000 type: 2
Jun 4 20:53:05 Pleiadi kernel: copy_e820_map() start: 00000000000ce000
size: 0000000000002000 end: 00000000000d0000 type: 2
Jun 4 20:53:05 Pleiadi kernel: copy_e820_map() start: 00000000000e0000
size: 0000000000020000 end: 0000000000100000 type: 2
Jun 4 20:53:05 Pleiadi kernel: copy_e820_map() start: 0000000000100000
size: 000000003fdf0000 end: 000000003fef0000 type: 1
Jun 4 20:53:05 Pleiadi kernel: copy_e820_map() type is E820_RAM
Jun 4 20:53:05 Pleiadi kernel: copy_e820_map() start: 000000003fef0000
size: 000000000000b000 end: 000000003fefb000 type: 3
Jun 4 20:53:05 Pleiadi kernel: copy_e820_map() start: 000000003fefb000
size: 0000000000005000 end: 000000003ff00000 type: 4
Jun 4 20:53:05 Pleiadi kernel: copy_e820_map() start: 000000003ff00000
size: 0000000000080000 end: 000000003ff80000 type: 1
Jun 4 20:53:05 Pleiadi kernel: copy_e820_map() type is E820_RAM
Jun 4 20:53:05 Pleiadi kernel: copy_e820_map() start: 000000003ff80000
size: 0000000000080000 end: 0000000040000000 type: 2
Jun 4 20:53:05 Pleiadi kernel: copy_e820_map() start: 00000000e0000000
size: 0000000010000000 end: 00000000f0000000 type: 2
Jun 4 20:53:05 Pleiadi kernel: copy_e820_map() start: 00000000fec00000
size: 0000000000100400 end: 00000000fed00400 type: 2
Jun 4 20:53:05 Pleiadi kernel: copy_e820_map() start: 00000000fee00000
size: 0000000000100000 end: 00000000fef00000 type: 2
Jun 4 20:53:05 Pleiadi kernel: copy_e820_map() start: 00000000ffb00000
size: 0000000000100000 end: 00000000ffc00000 type: 2
Jun 4 20:53:05 Pleiadi kernel: copy_e820_map() start: 00000000fff00000
size: 0000000000100000 end: 0000000100000000 type: 2
Jun 4 20:53:05 Pleiadi kernel: BIOS-e820: 0000000000000000 -
000000000009ac00 (usable)
Jun 4 20:53:05 Pleiadi kernel: BIOS-e820: 000000000009ac00 -
00000000000a0000 (reserved)
Jun 4 20:53:05 Pleiadi kernel: BIOS-e820: 00000000000ce000 -
00000000000d0000 (reserved)
Jun 4 20:53:05 Pleiadi kernel: BIOS-e820: 00000000000e0000 -
0000000000100000 (reserved)
Jun 4 20:53:05 Pleiadi kernel: BIOS-e820: 0000000000100000 -
000000003fef0000 (usable)
Jun 4 20:53:05 Pleiadi kernel: BIOS-e820: 000000003fef0000 -
000000003fefb000 (ACPI data)
Jun 4 20:53:05 Pleiadi kernel: BIOS-e820: 000000003fefb000 -
000000003ff00000 (ACPI NVS)
Jun 4 20:53:05 Pleiadi kernel: BIOS-e820: 000000003ff00000 -
000000003ff80000 (usable)
Jun 4 20:53:05 Pleiadi kernel: BIOS-e820: 000000003ff80000 -
0000000040000000 (reserved)
Jun 4 20:53:05 Pleiadi kernel: BIOS-e820: 00000000e0000000 -
00000000f0000000 (reserved)
Jun 4 20:53:05 Pleiadi kernel: BIOS-e820: 00000000fec00000 -
00000000fed00400 (reserved)
Jun 4 20:53:05 Pleiadi kernel: BIOS-e820: 00000000fee00000 -
00000000fef00000 (reserved)
Jun 4 20:53:05 Pleiadi kernel: BIOS-e820: 00000000ffb00000 -
00000000ffc00000 (reserved)
Jun 4 20:53:05 Pleiadi kernel: BIOS-e820: 00000000fff00000 -
0000000100000000 (reserved)
Jun 4 20:53:05 Pleiadi kernel: Zone PFN ranges:
Jun 4 20:53:05 Pleiadi kernel: DMA 0 -> 4096
Jun 4 20:53:05 Pleiadi kernel: Normal 4096 -> 229376
Jun 4 20:53:05 Pleiadi kernel: HighMem 229376 -> 262016
Jun 4 20:53:05 Pleiadi kernel: early_node_map[1] active PFN ranges
Jun 4 20:53:05 Pleiadi kernel: 0: 0 -> 262016
Jun 4 20:53:05 Pleiadi kernel: ACPI: RSDP 000F6BA0, 0024 (r2 PTLTD )
Jun 4 20:53:05 Pleiadi kernel: ACPI: XSDT 3FEF5381, 004C (r1 PTLTD ^I
XSDT 6040001 LTP 0)
Jun 4 20:53:05 Pleiadi kernel: ACPI: FACP 3FEF5441, 00F4 (r3 FSC
6040001 F4240)
Jun 4 20:53:05 Pleiadi kernel: ACPI: DSDT 3FEF5535, 597B (r1 FSC
D1649 6040001 MSFT 2000002)
Jun 4 20:53:05 Pleiadi kernel: ACPI: FACS 3FEFBFC0, 0040
Jun 4 20:53:05 Pleiadi kernel: ACPI: SPCR 3FEFAEB0, 0050 (r1 PTLTD
$UCRTBL$ 6040001 PTL 1)
Jun 4 20:53:05 Pleiadi kernel: ACPI: MCFG 3FEFAF00, 0040 (r1 PTLTD
MCFG 6040001 LTP 0)
Jun 4 20:53:05 Pleiadi kernel: ACPI: APIC 3FEFAF40, 0098 (r1 PTLTD ^I
APIC 6040001 LTP 0)
Jun 4 20:53:05 Pleiadi kernel: ACPI: BOOT 3FEFAFD8, 0028 (r1 PTLTD
$SBFTBL$ 6040001 LTP 1)
Jun 4 20:53:05 Pleiadi kernel: Processor #0 15:4 APIC version 20
Jun 4 20:53:05 Pleiadi kernel: Processor #1 15:4 APIC version 20
Jun 4 20:53:05 Pleiadi kernel: IOAPIC[0]: apic_id 2, version 32,
address 0xfec00000, GSI 0-23
Jun 4 20:53:05 Pleiadi kernel: IOAPIC[1]: apic_id 3, version 32,
address 0xfec80000, GSI 24-47
Jun 4 20:53:05 Pleiadi kernel: IOAPIC[2]: apic_id 4, version 32,
address 0xfec80800, GSI 48-71
Jun 4 20:53:05 Pleiadi kernel: IOAPIC[3]: apic_id 5, version 32,
address 0xfec84000, GSI 72-95
Jun 4 20:53:05 Pleiadi kernel: IOAPIC[4]: apic_id 6, version 32,
address 0xfec84800, GSI 96-119
Jun 4 20:53:05 Pleiadi kernel: Enabling APIC mode: Flat. Using 5 I/O
APICs
Jun 4 20:53:05 Pleiadi kernel: Allocating PCI resources starting at
50000000 (gap: 40000000:a0000000)
Jun 4 20:53:05 Pleiadi kernel: Built 1 zonelists. Total pages: 259969
Jun 4 20:53:05 Pleiadi kernel: PID hash table entries: 4096 (order: 12,
16384 bytes)
Jun 4 20:53:05 Pleiadi kernel: Detected 3200.428 MHz processor.
Jun 4 20:53:05 Pleiadi kernel: Console: colour VGA+ 80x25
Jun 4 20:53:05 Pleiadi kernel: Dentry cache hash table entries: 131072
(order: 7, 524288 bytes)
Jun 4 20:53:05 Pleiadi kernel: Inode-cache hash table entries: 65536
(order: 6, 262144 bytes)
Jun 4 20:53:05 Pleiadi kernel: virtual kernel memory layout:
Jun 4 20:53:05 Pleiadi kernel: fixmap : 0xfff9d000 - 0xfffff000
( 392 kB)
Jun 4 20:53:05 Pleiadi kernel: pkmap : 0xff800000 - 0xffc00000
(4096 kB)
Jun 4 20:53:05 Pleiadi kernel: vmalloc : 0xf8800000 - 0xff7fe000
( 111 MB)
Jun 4 20:53:05 Pleiadi kernel: lowmem : 0xc0000000 - 0xf8000000
( 896 MB)
Jun 4 20:53:05 Pleiadi kernel: .init : 0xc039f000 - 0xc03ce000
( 188 kB)
Jun 4 20:53:05 Pleiadi kernel: .data : 0xc02fd400 - 0xc0398114
( 619 kB)
Jun 4 20:53:05 Pleiadi kernel: .text : 0xc0100000 - 0xc02fd400
(2037 kB)
Jun 4 20:53:05 Pleiadi kernel: Checking if this processor honours the
WP bit even in supervisor mode... Ok.
Jun 4 20:53:05 Pleiadi kernel: Calibrating delay using timer specific
routine.. 6403.78 BogoMIPS (lpj=32018905)
Jun 4 20:53:05 Pleiadi kernel: Mount-cache hash table entries: 512
Jun 4 20:53:05 Pleiadi kernel: monitor/mwait feature present.
Jun 4 20:53:05 Pleiadi kernel: using mwait in idle threads.
Jun 4 20:53:05 Pleiadi kernel: CPU0: Intel(R) Xeon(TM) CPU 3.20GHz
stepping 0a
Jun 4 20:53:05 Pleiadi kernel: Booting processor 1/1 eip 2000
Jun 4 20:53:05 Pleiadi kernel: Calibrating delay using timer specific
routine.. 6400.45 BogoMIPS (lpj=32002267)
Jun 4 20:53:05 Pleiadi kernel: monitor/mwait feature present.
Jun 4 20:53:05 Pleiadi kernel: CPU1: Intel(R) Xeon(TM) CPU 3.20GHz
stepping 0a
Jun 4 20:53:05 Pleiadi kernel: ENABLING IO-APIC IRQs
Jun 4 20:53:05 Pleiadi kernel: migration_cost=142
Jun 4 20:53:05 Pleiadi kernel: Setting up standard PCI resources
Jun 4 20:53:05 Pleiadi kernel: ACPI: Device [PS2M] status [00000008]:
functional but not present; setting present
Jun 4 20:53:05 Pleiadi kernel: ACPI: Device [ECP] status [00000008]:
functional but not present; setting present
Jun 4 20:53:05 Pleiadi kernel: ACPI: Device [COM1] status [00000008]:
functional but not present; setting present
Jun 4 20:53:05 Pleiadi kernel: PCI quirk: region f000-f07f claimed by
ICH4 ACPI/GPIO/TCO
Jun 4 20:53:05 Pleiadi kernel: PCI quirk: region f180-f1bf claimed by
ICH4 GPIO
Jun 4 20:53:05 Pleiadi kernel: PCI: PXH quirk detected, disabling MSI
for SHPC device
Jun 4 20:53:05 Pleiadi kernel: PCI: PXH quirk detected, disabling MSI
for SHPC device
Jun 4 20:53:05 Pleiadi kernel: ACPI: PCI Interrupt Link [LNKA] (IRQs 3
4 5 6 7 9 10 *11 12 14 15)
Jun 4 20:53:05 Pleiadi kernel: ACPI: PCI Interrupt Link [LNKB] (IRQs 3
4 5 6 7 *9 10 11 12 14 15)
Jun 4 20:53:05 Pleiadi kernel: ACPI: PCI Interrupt Link [LNKC] (IRQs 3
4 *5 6 7 9 10 11 12 14 15)
Jun 4 20:53:05 Pleiadi kernel: ACPI: PCI Interrupt Link [LNKD] (IRQs 3
4 5 6 7 9 *10 11 12 14 15)
Jun 4 20:53:05 Pleiadi kernel: ACPI: PCI Interrupt Link [LNKE] (IRQs 3
4 5 6 7 9 10 11 12 14 15) *0, disabled.
Jun 4 20:53:05 Pleiadi kernel: ACPI: PCI Interrupt Link [LNKF] (IRQs 3
4 5 6 7 9 10 11 12 14 15) *0, disabled.
Jun 4 20:53:05 Pleiadi kernel: ACPI: PCI Interrupt Link [LNKG] (IRQs 3
4 5 6 7 9 10 11 12 14 15) *0, disabled.
Jun 4 20:53:05 Pleiadi kernel: ACPI: PCI Interrupt Link [LNKH] (IRQs 3
*4 5 6 7 9 10 11 12 14 15)
Jun 4 20:53:05 Pleiadi kernel: IP route cache hash table entries: 32768
(order: 5, 131072 bytes)
Jun 4 20:53:05 Pleiadi kernel: TCP established hash table entries:
131072 (order: 8, 1572864 bytes)
Jun 4 20:53:05 Pleiadi kernel: TCP bind hash table entries: 65536
(order: 7, 524288 bytes)
Jun 4 20:53:05 Pleiadi kernel: highmem bounce pool size: 64 pages
Jun 4 20:53:05 Pleiadi kernel: PNP: PS/2 controller doesn't have AUX
irq; using default 12
Jun 4 20:53:05 Pleiadi kernel: nf_conntrack version 0.5.0 (8188
buckets, 65504 max)
Jun 4 20:53:05 Pleiadi kernel: ip_tables: (C) 2000-2006 Netfilter Core
Team
Jun 4 20:53:05 Pleiadi kernel: Using IPI Shortcut mode
Jun 4 20:53:05 Pleiadi kernel: VFS: Mounted root (xfs filesystem)
readonly.

> stack trace, etc for when this happened?

I have only a monitor bitmap. Tell me if you want it.

[ Marco sent me the stack trace photo off-list; attached herewith. ]

Satyam

Attachment: 100-0081_IMG1.JPG
Description: JPEG image

<Prev in Thread] Current Thread [Next in Thread>