To: xfs-masters@xxxxxxxxxxx
Subject: [xfs-masters] [Bug 8414] soft lockup and filesystem corruption on XFS write (by nfsd)
From: bugme-daemon@xxxxxxxxxxxxxxxxxxx
Date: Wed, 2 May 2007 09:20:02 -0700
Reply-to: xfs-masters@xxxxxxxxxxx
Sender: xfs-masters-bounce@xxxxxxxxxxx
http://bugzilla.kernel.org/show_bug.cgi?id=8414

------- Additional Comments From dap@xxxxxxxxxxxxx  2007-05-02 09:20 -------
Thank you for the reply!

> The original problem - the soft lockup - is probably a result
> of a massively fragmented file as we are searching the extent
> list when the soft-lockup detector fired. The soft lockup
> detector is not indicative of an actual bug being present, though.
I waited about 10 minutes after the soft lockup, but one of the nfsd processes 
consumed 99% CPU the whole time until I rebooted into 2.6.20.3. nfsd did not 
reply to any requests after the soft lockup. I think it was a real infinite 
loop/deadlock/I-don't-know-what.
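
If it happens again, I will try to capture the stacks of the spinning nfsd 
before rebooting, roughly like this (just a sketch, assuming sysrq is enabled 
on this kernel; the output file name is only an example):

# make sure sysrq is enabled, then dump all task states/stacks to the kernel log
echo 1 > /proc/sys/kernel/sysrq
echo t > /proc/sysrq-trigger
dmesg > nfsd-lockup-stacks.txt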

>> attempt to access beyond end of device  
>> dm-6: rw=1, want=0, limit=2097152000  
>> Buffer I/O error on device dm-6, logical block 18446744073709551615  
>> lost page write due to I/O error on dm-6  
> Shows an attempt to write to a block marked as either a hole or delayed
> allocate. You're not having memory errors are you?
I ran I/O stress test scripts today, but there is no sign of memory/fs 
corruption or instability. I would like to run memtest86+, but as I said this 
is a production server and there are still 4TB of data on the remaining 
partitions being served read-only, so I cannot easily shut this box down for a 
long memtest run. Anyway, I have no reason to suspect bad memory (except that 
I just lost 1TB of data :)
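
By the way, the "logical block 18446744073709551615" in that error is just 
(u64)-1, so it does look like a sentinel value rather than a real block number. 
A quick check on a 64-bit box:

# -1 printed as an unsigned 64-bit integer
printf '%u\n' -1    # prints 18446744073709551615 on a 64-bit system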

> Can you run: 
> # xfs_db -r -c "sb 0" -c p <dev> 
> and
> # dd if=<dev> bs=512 count=1 | od -Ax 
> And attach the output so we can see how badly corrupted the superblock 
> is? 
I know, I made a mistake when I ran 'xfs_repair -L' without thinking. A lot of 
valuable debug info was lost. :(  The superblock dump is attached, but it's the 
corrected one, not the corrupted one.
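
For next time I have noted down what to capture before any destructive repair 
(just a sketch; the output file names are only examples):

# save the primary superblock before touching anything
dd if=/dev/vg1/wlv bs=512 count=1 of=sb0-before-repair.bin
# run xfs_repair in no-modify mode first and keep the log
xfs_repair -n /dev/vg1/wlv > xfs_repair-dryrun.log 2>&1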

> And finally, repair. What problem did you encounter? Did you just end 
> up with everything in lost+found? 
Yes, and I also got empty directories that weren't on the original partition, 
I'm sure. But 'du -hs lost+found' says everything is there.
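
This is roughly how I'm going through it (a sketch; the mount point is only an 
example):

# total size of the recovered data
du -hs /mnt/wlv/lost+found
# list the unexpected empty directories
find /mnt/wlv/lost+found -type d -empty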

> Hmmm - this reminds me of problems seen when the filesystem wraps the 
> device at 2TB. How big is the LVM volume this partition is on and where is 
> it placed?

Sulaco:~# vgdisplay -v vg1
  [...]
  Metadata Areas        1
  Metadata Sequence No  5
  [...]
  Total PE              89426
  Alloc PE / Size       89426 / 5.46 TB
  Free  PE / Size       0 / 0

Sulaco:~# lvdisplay -m /dev/vg1/wlv
  [...]
  LV Size                1000.00 GB
  Current LE             16000
  Segments               1
  [...]
  --- Segments ---
  Logical extent 0 to 15999:
    Type                linear
    Physical volume     /dev/md0
    Physical extents    32000 to 47999

Sulaco:~# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid6] [raid5] [raid4]
md0 : active raid5 sda1[0] sdg1[15] sdh1[14] sdf1[13] sde1[12] sdp1[11] sdo1[10] sdn1[9] sdm1[8] sdl1[7] sdk1[6] sdj1[5] sdi1[4] sdd1[3] sdc1[2] sdb1[1]
      5860631040 blocks level 5, 128k chunk, algorithm 2 [16/16] [UUUUUUUUUUUUUUUU]
      [=============>.......]  resync = 66.2% (258980992/390708736) finish=1451.1min speed=1511K/sec

It's placed in the middle of the VG and is only 1TB. There's a 2.5TB partition 
on this box too and it has no problems; I checked all volumes with xfs_check 
today.
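
About the 2TB wrap idea, here is a quick back-of-the-envelope check of where 
the LV actually sits on the array (rough numbers only, assuming a 64MiB PE 
size, i.e. 5.46TB / 89426 PE, and ignoring partition/md offsets):

# the LV occupies physical extents 32000-47999 on /dev/md0, 64MiB each
echo $((32000 * 64)) MiB   # start offset: 2048000 MiB, ~1.95 TiB into md0
echo $((48000 * 64)) MiB   # end offset:   3072000 MiB, ~2.93 TiB into md0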

> Along the same train of thought, I'm wondering if the device wasn't
> reconstructed correctly with the new kernel and that led to the problem.
> Were there any other error messages or warnings in the syslog from the
> boot when the problem first happened? 
No, nothing. The MD was complete and had been working for a day in the first 
case, and there was no rebuild after the reboot.


I've been running a stress test like this over NFS since this morning:

while :; do
  for i in `seq 1 100`; do
    rm $i
    dd if=/dev/zero of=$i bs=32k count=800 &
  done
  # wait for the background dd processes to finish before checking
  wait
  for i in `seq 1 100`; do
    if ! md5sum $i | grep -q <md5>; then
       # has not happened yet
       echo FATAL; exit 1
    fi
  done
done

No errors yet. It's not "that" partition, but it has the same size and file 
count. After I've mined lost+found for valuable data, I'll start this test on 
that partition too.

I don't know what the problem is, but the similar call traces on the soft 
lockups look very suspicious to me. I have never seen such messages on my 
servers before, and generally there is no massive fragmentation (checked with 
xfs_bmap) and the load is not particularly high.
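
For reference, this is roughly how I checked the fragmentation (a sketch; the 
file path is only an example):

# overall fragmentation factor of the filesystem (read-only)
xfs_db -r -c frag /dev/vg1/wlv
# per-file extent list for a large file
xfs_bmap -v /some/big/file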


------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

