
RE: Linux 2.4.18 freeze running dbench 1.3

To: <linux-xfs@xxxxxxxxxxx>
Subject: RE: Linux 2.4.18 freeze running dbench 1.3
From: Christian Røsnes <christian@xxxxxxxxx>
Date: Mon, 4 Mar 2002 14:20:10 +0100
Importance: Normal
In-reply-to: <LKEBKIBHKNEKEKPHFNCCCEEOCOAA.christian@xxxxxxxxx>
Sender: owner-linux-xfs@xxxxxxxxxxx
As per Mr Owens' description, I've now captured the kdb debugging
trace during the dbench crash.  Is this related to XFS, or could it
be some other part of the kernel?  (It happens on both of my
DL380 G2s.)

Any other pointers on where I should look are greatly appreciated.
Thank you.

Christian


-------------------- KDB DEBUG OUTPUT -----------------------

[root@dl02 dbench]# ./dbench 10
10 clients started
......invalid operand: 0000
CPU:    1
EIP:    0010:[<c01e8437>]    Not tainted
EFLAGS: 00010086
eax: 0000004a   ebx: f7537800   ecx: c03bbe24   edx: 00002f46
esi: 00000297   edi: f75378f0   ebp: 00000000   esp: f6d59ab8
ds: 0018   es: 0018   ss: 0018
Process dbench (pid: 1253, stackpage=f6d59000)
Stack: c02ee260 0000005a 00000010 00000005 f7537800 f6d59cb0 c01b5974 f7537800
       0000001e ffffffeb 00000000 00000000 f6d59bac f6d59bb0 f6d59bd8 f6d59bb8
       00000000 c01b568b f6bd0610 00000090 00000000 00000000 00000000 00000000
Call Trace: [<c01b5974>] [<c01b568b>] [<c0204c66>] [<c01f989a>] [<c0202f1a>]
   [<c01fa58b>] [<c020461d>] [<c0203960>] [<c02024ce>] [<c01fdb88>] [<c01f6802>]
   [<c0202460>] [<c01fde3b>] [<c0202460>] [<c0203630>] [<c0202460>] [<c01ff419>]
   [<c0145a26>] [<c01453b0>] [<c014561e>] [<c01078ab>]

Code: 0f 0b 5f 58 c6 83 f0 00 00 00 01 56 9d 89 e8 5b 5e 5f 5d c3

Entering kdb (current=0xf6d58000, pid 1253) on processor 1 Oops: invalid operand
due to oops @ 0xc01e8437
eax = 0x0000004a ebx = 0xf7537800 ecx = 0xc03bbe24 edx = 0x00002f46
esi = 0x00000297 edi = 0xf75378f0 esp = 0xf6d59ab8 eip = 0xc01e8437
ebp = 0x00000000 xss = 0x00000018 xcs = 0x00000010 eflags = 0x00010086
xds = 0xc02e0018 xes = 0x00000018 origeax = 0xffffffff &regs = 0xf6d59a84
[1]kdb> cpu
Currently on cpu 1
Available cpus: 0, 1
[1]kdb> bt
    EBP       EIP         Function(args)
           0xc01e8437 xfs_mod_incore_sb+0x97 (0xf7537800, 0x1e, 0xffffffeb, 0x0)
                               kernel .text 0xc0100000 0xc01e83a0 0xc01e8450
0xf6d59cb0 0xc01b5974 xfs_bmapi+0x634 (0xf6bd0628, 0x90000, 0x0, 0x10000, 0x2)
                               kernel .text 0xc0100000 0xc01b5340 0xc01b6550
           0xc02024ce linvfs_pb_bmap+0x6e (0xf6bd0628, 0xf6d59f4c, 0x2, 0x0, 0x)
                               kernel .text 0xc0100000 0xc0202460 0xc02025b0
           0xc01ff419 linvfs_write+0x2b9 (0xf6d364f4, 0x804bde0, 0xffc3, 0xf6d3)
                               kernel .text 0xc0100000 0xc01ff160 0xc01ff470
           0xc0145a26 sys_write+0x96 (0x7, 0x804bde0, 0xffc3, 0x18, 0x105b)
                               kernel .text 0xc0100000 0xc0145990 0xc0145b50
           0xc01078ab system_call+0x33
                               kernel .text 0xc0100000 0xc0107878 0xc01078b0
[1]kdb> cpu 0

Entering kdb (current=0xf6c48000, pid 1256) on processor 0 due to cpu switch
[0]kdb> bt
    EBP       EIP         Function(args)
           0xc01e8c3a _text_lock_xfs_mount+0x16 (0xf6ba2bf4, 0x30000, 0x0, 0x10)
                               kernel .text 0xc0100000 0xc01e8c24 0xc01e8ca0
           0xc02024ce linvfs_pb_bmap+0x6e (0xf6ba2bf4, 0xf6c49f4c, 0x2, 0x0, 0x)
                               kernel .text 0xc0100000 0xc0202460 0xc02025b0
           0xc01ff419 linvfs_write+0x2b9 (0xf6d369c4, 0x804bde0, 0xf803, 0xf6d3)
                               kernel .text 0xc0100000 0xc01ff160 0xc01ff470
           0xc0145a26 sys_write+0x96 (0x7, 0x804bde0, 0xf803, 0x18, 0x1062)
                               kernel .text 0xc0100000 0xc0145990 0xc0145b50
           0xc01078ab system_call+0x33
                               kernel .text 0xc0100000 0xc0107878 0xc01078b0
[0]kdb>
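
As a cross-check on the backtrace, the offsets kdb prints can be recomputed from the addresses on each "kernel .text" line. The sketch below assumes the last two values on that line are the function's start and end addresses (an assumption, though the results agree with the symbol+offset/size form ksymoops prints). The helper names are made up for illustration and are not part of kdb or ksymoops.

```python
def symbol_offset(eip, start, end):
    """Return (offset, size) for an address inside the function [start, end)."""
    if not start <= eip < end:
        raise ValueError("address is outside the function bounds")
    return eip - start, end - start


def format_symbol(name, eip, start, end):
    """Format an address the way ksymoops does: name+offset/size (hex)."""
    offset, size = symbol_offset(eip, start, end)
    return f"{name}+{offset:x}/{size:x}"


# Addresses copied from the CPU 1 backtrace above.
print(format_symbol("xfs_mod_incore_sb", 0xC01E8437, 0xC01E83A0, 0xC01E8450))
print(format_symbol("xfs_bmapi", 0xC01B5974, 0xC01B5340, 0xC01B6550))
```

This prints xfs_mod_incore_sb+97/b0 and xfs_bmapi+634/1210, so the oops address lies 0x97 bytes into a 0xb0-byte function.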


-------------------- KSYMOOPS OUTPUT -----------------------

And here's the ksymoops output.

ksymoops 2.4.1 on i686 2.4.9-13SGI_XFS_1.0.2smp.  Options used
     -v /usr/src/xfs/linux-2.4-xfs/linux/vmlinux (specified)
     -k /var/log/ksyms.crash (specified)
     -L (specified)
     -o /lib/modules/2.4.18-xfs/ (specified)
     -m /usr/src/xfs/linux-2.4-xfs/linux/System.map (specified)

No modules in ksyms, skipping objects
Warning (compare_maps): mismatch on symbol partition_name  , ksyms_base says c026ed40, vmlinux says c0173fc0.  Ignoring ksyms_base entry
Warning (compare_maps): ksyms_base symbol pci_hp_change_slot_info_R__ver_pci_hp_change_slot_info not found in vmlinux.  Ignoring ksyms_base entry
Warning (compare_maps): ksyms_base symbol pci_hp_deregister_R__ver_pci_hp_deregister not found in vmlinux.  Ignoring ksyms_base entry
Warning (compare_maps): ksyms_base symbol pci_hp_register_R__ver_pci_hp_register not found in vmlinux.  Ignoring ksyms_base entry
CPU:    1
EIP:    0010:[<c01e8437>]    Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010086
eax: 0000004a   ebx: f7537800   ecx: c03bbe24   edx: 00002f46
esi: 00000297   edi: f75378f0   ebp: 00000000   esp: f6d59ab8
ds: 0018   es: 0018   ss: 0018
Process dbench (pid: 1253, stackpage=f6d59000)
Stack: c02ee260 0000005a 00000010 00000005 f7537800 f6d59cb0 c01b5974 f7537800
       0000001e ffffffeb 00000000 00000000 f6d59bac f6d59bb0 f6d59bd8 f6d59bb8
       00000000 c01b568b f6bd0610 00000090 00000000 00000000 00000000 00000000
Call Trace: [<c01b5974>] [<c01b568b>] [<c0204c66>] [<c01f989a>] [<c0202f1a>]
   [<c01fa58b>] [<c020461d>] [<c0203960>] [<c02024ce>] [<c01fdb88>] [<c01f6802>]
   [<c0202460>] [<c01fde3b>] [<c0202460>] [<c0203630>] [<c0202460>] [<c01ff419>]
   [<c0145a26>] [<c01453b0>] [<c014561e>] [<c01078ab>]
Code: 0f 0b 5f 58 c6 83 f0 00 00 00 01 56 9d 89 e8 5b 5e 5f 5d c3

>>EIP; c01e8437 <xfs_mod_incore_sb+97/b0>   <=====
Trace; c01b5974 <xfs_bmapi+634/1210>
Trace; c01b568b <xfs_bmapi+34b/1210>
Trace; c0204c66 <xfs_iomap_write_delay+416/590>
Trace; c01f989a <_pagebuf_free_object+15a/1c0>
Trace; c0202f1a <xfs_zero_last_block+5aa/5f0>
Trace; c01fa58b <pagebuf_rele+16b/1c0>
Trace; c020461d <xfs_iomap_write+ad/d0>
Trace; c0203960 <xfs_bmap+120/220>
Trace; c02024ce <linvfs_pb_bmap+6e/150>
Trace; c01fdb88 <_pagebuf_file_write+f8/210>
Trace; c01f6802 <xfs_reclaim+42/1f0>
Trace; c0202460 <linvfs_pb_bmap+0/150>
Trace; c01fde3b <pagebuf_generic_file_write+19b/330>
Trace; c0202460 <linvfs_pb_bmap+0/150>
Trace; c0203630 <xfs_write+290/4a0>
Trace; c0202460 <linvfs_pb_bmap+0/150>
Trace; c01ff419 <linvfs_write+2b9/310>
Trace; c0145a26 <sys_write+96/1c0>
Trace; c01453b0 <generic_file_llseek+0/b0>
Trace; c014561e <sys_lseek+12e/140>
Trace; c01078ab <system_call+33/38>
Code;  c01e8437 <xfs_mod_incore_sb+97/b0>
00000000 <_EIP>:
Code;  c01e8437 <xfs_mod_incore_sb+97/b0>   <=====
   0:   0f 0b                     ud2a      <=====
Code;  c01e8439 <xfs_mod_incore_sb+99/b0>
   2:   5f                        pop    %edi
Code;  c01e843a <xfs_mod_incore_sb+9a/b0>
   3:   58                        pop    %eax
Code;  c01e843b <xfs_mod_incore_sb+9b/b0>
   4:   c6 83 f0 00 00 00 01      movb   $0x1,0xf0(%ebx)
Code;  c01e8442 <xfs_mod_incore_sb+a2/b0>
   b:   56                        push   %esi
Code;  c01e8443 <xfs_mod_incore_sb+a3/b0>
   c:   9d                        popf
Code;  c01e8444 <xfs_mod_incore_sb+a4/b0>
   d:   89 e8                     mov    %ebp,%eax
Code;  c01e8446 <xfs_mod_incore_sb+a6/b0>
   f:   5b                        pop    %ebx
Code;  c01e8447 <xfs_mod_incore_sb+a7/b0>
  10:   5e                        pop    %esi
Code;  c01e8448 <xfs_mod_incore_sb+a8/b0>
  11:   5f                        pop    %edi
Code;  c01e8449 <xfs_mod_incore_sb+a9/b0>
  12:   5d                        pop    %ebp
Code;  c01e844a <xfs_mod_incore_sb+aa/b0>
  13:   c3                        ret

Entering kdb (current=0xf6d58000, pid 1253) on processor 1 Oops: invalid operand
eax = 0x0000004a ebx = 0xf7537800 ecx = 0xc03bbe24 edx = 0x00002f46
esi = 0x00000297 edi = 0xf75378f0 esp = 0xf6d59ab8 eip = 0xc01e8437
ebp = 0x00000000 xss = 0x00000018 xcs = 0x00000010 eflags = 0x00010086

4 warnings issued.  Results may not be reliable.
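
The first two bytes on the "Code:" line, 0f 0b, are the x86 UD2 instruction, which always raises an invalid-opcode exception. That matches the "invalid operand" oops and would be consistent with a deliberate trap (a BUG()-style check) inside xfs_mod_incore_sb rather than a wild jump. This is only an interpretation; the trace alone does not prove it. A minimal sketch of the check, with illustrative helper names:

```python
CODE_LINE = "Code: 0f 0b 5f 58 c6 83 f0 00 00 00 01 56 9d 89 e8 5b 5e 5f 5d c3"

UD2 = b"\x0f\x0b"  # x86 UD2 opcode: guaranteed invalid-opcode exception


def parse_code_line(line):
    """Turn an oops 'Code:' line into a bytes object."""
    prefix, _, hex_part = line.partition(":")
    if prefix.strip() != "Code":
        raise ValueError("not a Code: line")
    return bytes.fromhex("".join(hex_part.split()))


def starts_with_ud2(code):
    """True if the faulting instruction is UD2."""
    return code[:2] == UD2


code = parse_code_line(CODE_LINE)
print(len(code), "bytes; faulting instruction is UD2:", starts_with_ud2(code))
```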


-----Original Message-----

From: owner-linux-xfs@xxxxxxxxxxx [mailto:owner-linux-xfs@xxxxxxxxxxx] On Behalf Of Christian Røsnes
Sent: 2 March 2002 15:49
To: linux-xfs@xxxxxxxxxxx
Subject: Linux 2.4.18 freeze running dbench 1.3


Hello

Every time I run dbench 1.3 on a Linux 2.4.18 kernel built from the XFS CVS
checkout, I experience complete lockups on Compaq ProLiant DL380 G2 servers.
        * Dual CPU, 1266 MHz
        * SmartArray RAID controller
        * 2 x 36 GB HD in RAID 1
        * 1256 MB RAM
        * eepro100 NIC

(I've installed Red Hat 7.2 with the SGI 1.0.2a installer.)

Dbench works fine when running it with the 2.4.9-13SGI_XFS_1.0.2smp kernel
from the SGI installer.

The freeze happens every time I start dbench 1.3 with the 2.4.18 kernel.
(I'm running it on my /usr partition, which is XFS-formatted and mounted.)

E.g.:
./dbench 10
Starting 10 clients
<system freezes>

The system then becomes unresponsive, and resetting the machine
is the only way to get it back on its feet.
I've tested this on two different DL380 G2s, and it locks up on both.

Is anyone else seeing this?

Is there any way I can debug this?
When the system freezes, there are no more entries in /var/log/messages.
There are no Oopses.

I've included my kernel config file (gzipped) and the message output found
in /var/log/messages (also gzipped).

Any help appreciated. Thank you.

(I do not have physical access to the machines at the moment,
but will on Monday)

Christian

-------------------------- CHECKOUT INFO -------------------------------

I've tried checkouts from Thursday, Friday and today; all of them freeze
when running 2.4.18 and dbench 1.3:

#!/bin/sh
#
export CVSROOT=':pserver:cvs@xxxxxxxxxxx:/cvs'
cvs login
# FULL CHECKOUT
cvs -z3 checkout linux-2.4-xfs


------------------------------ XFS INFO --------------------------------

Here's the xfs_info output for the /usr partition I run dbench on.

[root@dl01 xfs]# xfs_info /usr
meta-data=/usr                   isize=256    agcount=14, agsize=262144 blks
data     =                       bsize=4096   blocks=3584280, imaxpct=25
         =                       sunit=0      swidth=0 blks, unwritten=0
naming   =version 2              bsize=4096
log      =internal               bsize=4096   blocks=1200
realtime =none                   extsz=65536  blocks=0, rtextents=0
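
For reference, the block counts above can be turned into sizes with a little arithmetic. The sketch below uses only the bsize, blocks, agsize and log blocks values from the xfs_info output; the function and key names are illustrative. It also checks that agcount=14 follows from blocks and agsize, which implies the last allocation group is shorter than the others.

```python
def summarize_xfs_geometry(bsize, blocks, agsize, log_blocks):
    """Derive sizes from the block counts that xfs_info prints."""
    _, remainder = divmod(blocks, agsize)
    return {
        "data_bytes": blocks * bsize,
        "data_gib": round(blocks * bsize / 2**30, 2),
        # Ceiling division: a partial last allocation group still counts.
        "agcount": -(-blocks // agsize),
        "last_ag_blocks": remainder or agsize,
        "log_bytes": log_blocks * bsize,
    }


# Values copied from the xfs_info output for /usr.
geometry = summarize_xfs_geometry(
    bsize=4096, blocks=3584280, agsize=262144, log_blocks=1200
)
for key, value in geometry.items():
    print(f"{key}: {value}")
```

With these inputs the data section is 14681210880 bytes (about 13.67 GiB), the internal log is 4915200 bytes, and the 14th allocation group holds 176408 blocks.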


------------------------------ MOUNT INFO --------------------------------

I've tried mounting /usr both with the default logbufs and logbsize values
and with the options shown for /usr below (logbufs=8,logbsize=32768).
dbench locks up in both cases.

[root@dl01 xfs]# mount
/dev/cciss/c0d0p6 on / type xfs (rw)
none on /proc type proc (rw)
/dev/cciss/c0d0p1 on /boot type ext2 (rw)
none on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/cciss/c0d0p7 on /home type xfs (rw,logbufs=8,logbsize=32768)
none on /dev/shm type tmpfs (rw)
/dev/cciss/c0d0p2 on /usr type xfs (rw,logbufs=8,logbsize=32768)
/dev/cciss/c0d0p3 on /var type xfs (rw,logbufs=8,logbsize=32768)
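
To see at a glance which XFS filesystems carry non-default log options, the mount listing can be parsed as below. This is a rough sketch that assumes the "<device> on <mountpoint> type <fstype> (<options>)" layout shown above and mount points without spaces; the function name is illustrative.

```python
MOUNT_OUTPUT = """\
/dev/cciss/c0d0p6 on / type xfs (rw)
none on /proc type proc (rw)
/dev/cciss/c0d0p1 on /boot type ext2 (rw)
none on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/cciss/c0d0p7 on /home type xfs (rw,logbufs=8,logbsize=32768)
none on /dev/shm type tmpfs (rw)
/dev/cciss/c0d0p2 on /usr type xfs (rw,logbufs=8,logbsize=32768)
/dev/cciss/c0d0p3 on /var type xfs (rw,logbufs=8,logbsize=32768)
"""


def xfs_mounts(text):
    """Map each XFS mount point to its list of mount options."""
    mounts = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip anything that does not match the expected six-field layout.
        if len(fields) != 6 or fields[1] != "on" or fields[3] != "type":
            continue
        _device, _, mountpoint, _, fstype, options = fields
        if fstype == "xfs":
            mounts[mountpoint] = options.strip("()").split(",")
    return mounts


for mountpoint, options in xfs_mounts(MOUNT_OUTPUT).items():
    log_opts = [opt for opt in options if opt.startswith("log")]
    print(mountpoint, log_opts or "default log options")
```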


