
bdflush <defunct>

To: Linux XFS List <linux-xfs@xxxxxxxxxxx>
Subject: bdflush <defunct>
From: Mihai RUSU <dizzy@xxxxxxxxx>
Date: Tue, 18 Jun 2002 21:45:44 +0300 (EEST)
Sender: owner-linux-xfs@xxxxxxxxxxx

Hi,

We use XFS on many production servers here. We have run kernels from
2.4.9-13 (XFS 1.0.2 release) through 2.4.9-31 (XFS 1.1 release) with no
problems until now.

Today I found a strange message in the dmesg output. I ran ksymoops on it,
and this is the output:

ksymoops 2.4.5 on i686 2.4.9-31SGI_XFS_1.1.  Options used
     -v /usr/src/linux/vmlinux (specified)
     -K (specified)
     -l /proc/modules (default)
     -o /lib/modules/2.4.9-31SGI_XFS_1.1/ (default)
     -m /usr/src/linux/System.map (default)

No modules in ksyms, skipping objects
No ksyms, skipping lsmod
invalid operand: 0000
CPU:    0
EIP:    0010:[<c01db07b>]    Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010287
eax: 00000000   ebx: 00000008   ecx: f45cbcc0   edx: f45cbcc0
esi: 00000001   edi: f45cbcc0   ebp: 00001000   esp: f7da7f58
ds: 0018   es: 0018   ss: 0018
Process bdflush (pid: 7, stackpage=f7da7000)
Stack: 00000001 f45cbcc0 c01db2bb 00000001 f45cbcc0 f45cbcc0 0000000d f6b9e720
       f7da7fbc 00000001 f7da6000 00000286 c02732f8 f7da633a 00000010 c02708e0
       00000000 00000000 f7da7fd4 c013487d 00000001 00000001 f7da7fc4 f45cbcc0
Call Trace: [<c01db2bb>]
[<c013487d>]
[<c0137bca>]
[<c024e8ae>]
[<c0137deb>]
[<c01055b4>]
Code: 0f 0b b8 03 00 00 00 f0 0f ab 42 18 0f b7 42 0c 66 89 42 14


>>EIP; c01db07b <submit_bh+2b/6c>   <=====

>>ecx; f45cbcc0 <END_OF_CODE+3429214c/????>
>>edx; f45cbcc0 <END_OF_CODE+3429214c/????>
>>edi; f45cbcc0 <END_OF_CODE+3429214c/????>
>>ebp; 00001000 Before first symbol
>>esp; f7da7f58 <END_OF_CODE+37a6e3e4/????>

Trace; c01db2bb <ll_rw_block+1ff/270>
Trace; c013487d <write_buffer+6d/7c>
Trace; c0137bca <flush_dirty_buffers+a2/e4>
Trace; c024e8ae <Unused_offset+a6a/451c>
Trace; c0137deb <bdflush+73/b0>
Trace; c01055b4 <kernel_thread+28/38>
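
For anyone who wants to post-process a decoded trace like the one above, here is a minimal sketch in Python. It assumes each line has the `Trace; <address> <symbol+offset/size>` shape shown in this report. The names `TRACE_LINES`, `TRACE_RE` and `parse_trace` are hypothetical helpers made up for illustration and are not part of ksymoops.

```python
import re

# A few of the decoded trace lines from the report above, used as sample input.
TRACE_LINES = """\
Trace; c01db2bb <ll_rw_block+1ff/270>
Trace; c013487d <write_buffer+6d/7c>
Trace; c0137bca <flush_dirty_buffers+a2/e4>
Trace; c0137deb <bdflush+73/b0>
"""

# Assumed line shape: "Trace; ADDRESS <SYMBOL+OFFSET/SIZE>", all numbers in hex.
TRACE_RE = re.compile(r"^Trace;\s+([0-9a-f]+)\s+<(\w+)\+([0-9a-f]+)/([0-9a-f]+)>")


def parse_trace(text):
    """Return one dict per trace line that matches the assumed shape."""
    frames = []
    for line in text.splitlines():
        match = TRACE_RE.match(line)
        if match is None:
            continue  # skip anything that is not a decoded trace line
        address, symbol, offset, size = match.groups()
        frames.append({
            "address": int(address, 16),
            "symbol": symbol,
            "offset": int(offset, 16),
            "size": int(size, 16),
        })
    return frames


for frame in parse_trace(TRACE_LINES):
    # address minus offset gives the start address of the function
    start = frame["address"] - frame["offset"]
    print(f"{frame['symbol']} starts at {start:#x}")
```

Subtracting the offset from the return address gives the start of each function, which can then be compared against System.map.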

Code;  c01db07b <submit_bh+2b/6c>
0000000000000000 <_EIP>:
Code;  c01db07b <submit_bh+2b/6c>   <=====
   0:   0f 0b                     ud2a      <=====
Code;  c01db07d <submit_bh+2d/6c>
   2:   b8 03 00 00 00            mov    $0x3,%eax
Code;  c01db082 <submit_bh+32/6c>
   7:   f0 0f ab 42 18            lock bts %eax,0x18(%edx)
Code;  c01db087 <submit_bh+37/6c>
   c:   0f b7 42 0c               movzwl 0xc(%edx),%eax
Code;  c01db08b <submit_bh+3b/6c>
  10:   66 89 42 14               mov    %ax,0x14(%edx)


More info:
Linux version 2.4.9-31SGI_XFS_1.1 (dizzy@us) (gcc version 2.95.3
20010315 (release)) #1 SMP

It's a custom-compiled version. We had it running for 33 days with no
problems at all (until we rebooted for a hardware upgrade) on the same
machine doing the same thing.
Please tell me what to do, because I have a bdflush process that shows up
in the Z (zombie) state, and I don't know whether this can corrupt my data
(although the system seems to be running OK).
Output of ps ax | head:

  PID TTY      STAT   TIME COMMAND
    1 ?        S      0:07 init [3]
    2 ?        SW     0:00 [keventd]
    3 ?        RWN    0:12 [ksoftirqd_CPU0]
    4 ?        SWN    0:14 [ksoftirqd_CPU1]
    5 ?        SW     6:02 [kswapd]
    6 ?        SW     0:00 [kreclaimd]
    7 ?        Z      9:20 [bdflush <defunct>]
    8 ?        SW    83:42 [kupdated]
    9 ?        SW     0:15 [pagebuf_daemon]
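
A quick way to spot entries like the defunct bdflush in a saved listing is to filter on the STAT column. The sketch below is only an illustration: it assumes the five-column `ps ax` layout shown above, and `PS_OUTPUT` and `find_zombies` are hypothetical names made up for this example.

```python
# Sample input: a few lines copied from the ps listing above.
PS_OUTPUT = """\
  PID TTY      STAT   TIME COMMAND
    1 ?        S      0:07 init [3]
    7 ?        Z      9:20 [bdflush <defunct>]
    8 ?        SW    83:42 [kupdated]
"""


def find_zombies(ps_text):
    """Return (pid, command) pairs whose STAT field starts with 'Z'."""
    zombies = []
    lines = ps_text.strip().splitlines()
    for line in lines[1:]:  # skip the header row
        # Split into at most 5 fields so the command keeps its spaces.
        fields = line.split(None, 4)
        if len(fields) < 5:
            continue  # not a complete process row
        pid, _tty, stat, _time, command = fields
        if stat.startswith("Z"):
            zombies.append((int(pid), command))
    return zombies


print(find_zombies(PS_OUTPUT))
```

On the sample above this reports only PID 7, the bdflush entry.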

However, the machine recently started having problems. The first was a
strange "crash": it responded to ping, and connecting to its services
returned "connection established", but nothing else worked, not even a new
SSH session. I even filtered all traffic to it at an upstream router and
allowed only my machine, with no luck. Attaching a console did not work
either (the screen remained blank). The second problem was 3 days ago, when
one of our httpd servers died and I could not restart it because the kernel
said "address already in use", although netstat showed no other process
listening there (I even waited for 2 hours). I have hit this second problem
with 2.4.9-13 and 2.4.9-21 too, but very rarely (at roughly 3-month
intervals).

Thanks

----------------------------
Mihai RUSU

Disclaimer: Any views or opinions presented within this e-mail are solely
those of the author and do not necessarily represent those of any company,
unless otherwise specifically stated.

