Hi
I am using XFS with the Linux 2.6.7 kernel on Red Hat 8.0.
xfsprogs version is 2.6.13.
I was running a kind of crash test on an XFS
filesystem to check recovery/corruptions from unclean
shutdowns.
The XFS filesystem sits on an 18G partition. The test
worked in the following manner:
20 directories were created under the filesystem root.
A sample program that repeatedly performs a mix of
operations (create, delete, link, read, write, rename,
etc.) was used to generate load. 20 threads of this
program were spawned, each working in one of the 20
directories, to generate heavy filesystem load.
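
For reference, the workload is roughly of the following shape. This is only a minimal sketch and not the actual test program: the names worker and run_load, the dirN/fN/lN/rN naming scheme, the 4096-byte write size, and the bounded iteration count are all assumptions made for illustration. In the real test the threads simply kept running until the power was cut.

```python
import os
import random
import threading


def worker(directory, iterations, seed):
    """Apply a random mix of file operations inside one directory."""
    rng = random.Random(seed)
    os.makedirs(directory, exist_ok=True)
    counter = 0  # per-worker counter keeps new names unique
    for _ in range(iterations):
        files = os.listdir(directory)
        op = rng.choice(["create", "write", "read", "link", "rename", "delete"])
        if op == "create" or not files:
            # Create a new file (also the fallback when the directory is empty).
            counter += 1
            path = os.path.join(directory, "f%d" % counter)
            with open(path, "wb") as fh:
                fh.write(os.urandom(rng.randint(1, 4096)))
            continue
        target = os.path.join(directory, rng.choice(files))
        if op == "write":
            with open(target, "ab") as fh:
                fh.write(os.urandom(rng.randint(1, 4096)))
        elif op == "read":
            with open(target, "rb") as fh:
                fh.read()
        elif op == "link":
            counter += 1
            os.link(target, os.path.join(directory, "l%d" % counter))
        elif op == "rename":
            counter += 1
            os.rename(target, os.path.join(directory, "r%d" % counter))
        else:  # delete
            os.unlink(target)


def run_load(root, num_dirs=20, iterations=1000):
    """Start one worker thread per directory and wait for all of them."""
    threads = []
    for i in range(num_dirs):
        directory = os.path.join(root, "dir%d" % i)
        t = threading.Thread(target=worker, args=(directory, iterations, i))
        t.start()
        threads.append(t)
    for t in threads:
        t.join()
```

Calling run_load("/xfs") would create the 20 directories under the mount point and drive them in parallel; each thread only ever touches its own directory, so the threads do not race on file names.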
About 5 minutes after the threads were spawned, once
the load had built up a bit, the machine was crashed
with a direct power-off.
This cycle was repeated about 200 times.
After about 164 cycles, the filesystem usage reached
100% and further writes failed as expected. I had
logged the dmesg output for each reboot cycle,
and all of the logs showed that XFS recovery did not
run into any problems. The messages seen in each dmesg
log were:
<snip>
Starting XFS recovery on filesystem: cciss/c0d0p8
(dev: cciss/c0d0p8)
Ending XFS recovery on filesystem: cciss/c0d0p8 (dev:
cciss/c0d0p8
</snip>
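
Checking some 200 dmesg logs for these two messages can be automated. The following is a minimal sketch, assuming each saved log is available as a text string; the function name recovery_completed is hypothetical. It only confirms that every "Starting XFS recovery" message is followed by an "Ending XFS recovery" message for the same filesystem, which says nothing about whether the on-disk metadata is actually consistent.

```python
import re

# Patterns taken from the two messages quoted above.
START = re.compile(r"Starting XFS recovery on filesystem: (\S+)")
END = re.compile(r"Ending XFS recovery on filesystem: (\S+)")


def recovery_completed(log_text):
    """Return True if recovery started at least once and every start has a later end."""
    pending = set()  # filesystems whose recovery has started but not ended
    seen_start = False
    for line in log_text.splitlines():
        match = START.search(line)
        if match:
            pending.add(match.group(1))
            seen_start = True
            continue
        match = END.search(line)
        if match:
            pending.discard(match.group(1))
    return seen_start and not pending
```

A log for which this returns False is one where recovery either never ran or never reported completion, and is worth reading by hand.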
Up to this point, everything was fine, with XFS
recovering properly after each crash, even after the
filesystem was 100% full.
Next, I deleted 10 of the 20 top level directories to
free up some space.
Here, the "rm -rf" command for one of the
directories hung.
After some time of inactivity, I rebooted the system (a
clean reboot) and noticed
that XFS recovery failed. The relevant sections of the
boot messages are attached in xfs_bootup_failure.txt.
Next, I tried xfs_check. It printed a lot of
"block 12/232064 type unknown not expected" messages
and then stopped responding as well. At this point I
noticed a defunct xfs_db process on the system.
<snip>
[root@mirahp1 root]# ps -aef | grep 1433
root 1433 1377 0 11:39 pts/1 00:00:00 /bin/sh -f /usr/sbin/xfs_check /dev/cciss/c0d0p8
root 1434 1433 0 11:39 pts/1 00:00:01 [xfs_db] <defunct>
</snip>
After this, I tried xfs_repair. The session trace
follows:
<snip>
[root@mirahp1 root]# xfs_repair /dev/cciss/c0d0p8
Phase 1 - find and verify superblock...
Phase 2 - using internal log
- zero log...
ERROR: The filesystem has valuable metadata changes in a log which needs to
be replayed. Mount the filesystem to replay the log, and unmount it before
re-running xfs_repair. If you are unable to mount the filesystem, then use
the -L option to destroy the log and attempt a repair.
Note that destroying the log may cause corruption -- please attempt a mount
of the filesystem before doing this.
[root@mirahp1 root]# mount -t xfs /dev/cciss/c0d0p8 /xfs
Segmentation fault
</snip>
Running xfs_repair with -L after this point also
results in a hang.
Any ideas what's going wrong?
Basically, it looks like my filesystem is
inaccessible now:
I am unable to mount it or run any repair on it.
Any help will be appreciated.
Thanks,
Ash