Segfault of xfs_repair during repair of a xfs filesystem

Subject: Segfault of xfs_repair during repair of a xfs filesystem
From: Rainer Krienke <krienke@xxxxxxxxxxxxxx>
Date: Mon, 5 Jan 2004 08:49:59 +0100
A happy new year to everyone on this list ...

Well, for some of my xfs filesystems the year had a bad start. After a 
power failure, some systems possibly rebooted and later crashed again due 
to another power failure, and some xfs filesystems were damaged. I could 
not mount these filesystems and so ran xfs_repair -L.

For one filesystem (~150GB) this worked. xfs_repair reported some errors in 
the filesystem but finished its work. Next I tried to mount this filesystem, 
but mount complained that it could not find a valid superblock. So I ran 
xfs_repair once again. It still found some errors (but fewer than before). 
Next I rebooted the machine and the filesystem was mounted.

Another filesystem (~40GB) on a second machine, which strangely enough had 
been mounted on reboot but was not accessible (ls /filesystem: input/output 
error), could not be repaired by xfs_repair. I unmounted it and started 
xfs_repair, and again I had to use -L, since after unmounting I was no longer 
able to mount it. xfs_repair reported many more errors than on the first 
filesystem above, and then (in pass 4, I think) it segfaulted while 
traversing the directory hierarchy from /. A second and a third run of 
xfs_repair each produced more errors, but always ended in a segfault. In the 
meantime I have recovered the data from tape, but I still have the old broken 
filesystem for further investigation if needed.

Now I would like to know if this behaviour is "normal":

- Can a filesystem with transaction logging like xfs become inconsistent 
because of a power failure? There was no disk failure!

- Should xfs_repair find all errors in one run (like a regular fsck does) or 
do I have to run it again and again until it reports no more errors?

- Is it a known issue that xfs_repair sometimes segfaults, or is it perhaps a 
problem with my version (see below)?

- Can I do something else to avoid corrupted xfs filesystems in case of a 
power failure?

The machine is a SuSE Linux 8.2 multiprocessor system with the SuSE-patched 
kernel 2.4.21-144 (from a SuSE 9.0 system). The xfs kernel driver reports 
version 1.3.1, but I think the filesystems in question were created with an 
earlier kernel, I guess with driver version 1.3beta. xfs_repair is version 
2.5.6. All filesystems are on logical volumes that are handled by lvm2. The 
underlying (lvm) physical volume is a software RAID 1, to prevent data loss 
due to a disk failure.

Thanks for any help 
