I just saw the same crash this morning, my first experience with
2.6.0-test2.
4 x Xeon processors
2 GB ram
fibre channel drives
more details upon request...
I was running SPEC SFS, so 144 NFS client processes spread across 8 NFS
client machines attached by gigabit ethernet to my server under test: 36
filesystems, with SPEC trying to write 555 MB to each filesystem as fast
as the NFS/network infrastructure would allow. I'm running stock
2.6.0-test2 with the QLogic qla2xxx Fibre Channel driver, version 8.00.00b4
(http://sourceforge.net/projects/linux-qla2xxx/), plus the following patch
to the driver code to make the LUNs visible.
--- linux/drivers/scsi/qla2xxx/qla_os.c~	Tue Jul 29 08:39:12 2003
+++ linux/drivers/scsi/qla2xxx/qla_os.c	Tue Jul 29 08:38:56 2003
@@ -2634,6 +2634,7 @@
 	qla2x00_cfg_display_devices();
 	scsi_add_host(host, &pdev->dev);
+	scsi_scan_host(host);
 	return 0;
Here's the oops/BUG listing.
Thanks,
Erik Habbinga
Hewlett Packard
# ------------[ cut here ]------------
kernel BUG at fs/xfs/pagebuf/page_buf.c:1291!
invalid operand: 0000 [#1]
CPU: 6
EIP: 0060:[<c0213b12>] Not tainted
EFLAGS: 00010202
EIP is at bio_end_io_pagebuf+0xc2/0x12e
eax: 00000001 ebx: f6fb2380 ecx: 00000000 edx: c185a778
esi: f060fbc0 edi: 00000000 ebp: ed2bf600 esp: f5f3d9fc
ds: 007b es: 007b ss: 0068
Process nfsd (pid: 5007, threadinfo=f5f3c000 task=f56e46a0)
Stack: 00000001 00000000 00000046 c26a0a00 00000009 00001000 ed2bf600 00000000
       00000200 00000200 c01562d7 ed2bf600 00000200 00000000 01801b1f 00000200
       ed2bf600 c0269374 ed2bf600 00000200 00000000 c2654600 ed2bf600 00000000
Call Trace:
[<c01562d7>] bio_endio+0x55/0x7a
[<c0269374>] __end_that_request_first+0x204/0x224
[<c029519f>] scsi_end_request+0x3b/0xbc
[<c0295502>] scsi_io_completion+0x144/0x442
[<c0293648>] scsi_delete_timer+0x16/0x30
[<c02dda90>] sd_rw_intr+0x4e/0x198
[<c020501a>] xfs_trans_commit+0x116/0x3d4
[<c02045e3>] xfs_trans_dup+0xed/0xfc
[<c01efbd5>] xfs_itruncate_finish+0x24d/0x430
[<c020c450>] xfs_inactive_free_eofblocks+0x26c/0x2ba
[<c021007b>] xfs_rwunlock+0x1/0x3a
[<c020cb4e>] xfs_release+0x94/0xdc
[<c021665d>] linvfs_release+0x1d/0x24
[<c01518d0>] close_private_file+0x28/0x2a
[<c01954f5>] nfsd_close+0x1d/0x3c
[<c0195c97>] nfsd_write+0x20d/0x348
[<c0336975>] udp_push_pending_frames+0x12d/0x244
[<c03374d3>] udp_sendpage+0xf3/0x2a6
[<c035ebe2>] svcauth_unix_accept+0x26a/0x28e
[<c01929cc>] nfsd_proc_write+0xa8/0x122
[<c0191a74>] nfsd_dispatch+0xe8/0x1e5
[<c019198c>] nfsd_dispatch+0x0/0x1e5
[<c035ac4b>] svc_process+0x4eb/0x673
[<c01917e2>] nfsd+0x1de/0x388
[<c0191604>] nfsd+0x0/0x388
[<c010703d>] kernel_thread_helper+0x5/0xc
Code: 0f 0b 0b 05 4f a6 37 c0 eb a9 89 d0 e8 d3 12 f2 ff eb a0 81
<0>Kernel panic: Fatal exception in interrupt
In interrupt handler - not syncing
Sorry for the repost; I forgot to cc the list.
-----Original Message-----
From: Kostadin Todorov Karaivanov [mailto:larry@xxxxxxxxx]
Sent: Tuesday, July 29, 2003 10:52 AM
To: 'Nathan Scott'
Subject: RE: The infamous BUG in page_buff.c
> -----Original Message-----
> From: Nathan Scott [mailto:nathans@xxxxxxx]
> Sent: Tuesday, July 29, 2003 10:19 AM
> To: k.karaivanov@xxxxxxxxx
> Cc: linux-xfs@xxxxxxxxxxx
> Subject: Re: The infamous BUG in page_buff.c
>
>
> On Tue, Jul 29, 2003 at 09:58:42AM +0300, Kostadin Todorov
> Karaivanov wrote:
> > Short summary:
> > I have seen at least 3 reports of what I believe is the same BUG, one
> > of which is mine. It has been present from 2.5.6x until now. I know
> > you are focused on the 2.4 branch, but still...
> >
> > reference:
> > <http://marc.theaimsgroup.com/?l=linux-kernel&m=105941333012271&w=2>
> > <http://www.ussg.iu.edu/hypermail/linux/kernel/0306.3/0357.html>
> > <http://marc.theaimsgroup.com/?l=linux-xfs&m=105410871804737&w=2>
> >
>
> A reproducible test case would be a big help here.
Alas, it's not so easy, at least for me.
Sometimes it happens while I sit quietly and do nothing on that PC; other
times it happens while I recompile the kernel. To catch my oops I was
forced to run an endless loop of make bzImage; make modules; make clean.
On the other hand, yesterday I tried the same thing, plus some postgresql
benchmarks in the background just to raise the load, and nothing happened.
Two hours later, when I had given up and stopped everything, and I was
doing REALLY nothing, the machine died 8-( .
>
> thanks.
>
> --
> Nathan
>