Adrian Head wrote:
On Fri, 11 Jan 2002 13:37, Stephen Lord wrote:
Any suggestions as to how the damage was done in the first place -
what the bug is?
Not as yet, but there have now been two cases of filesystems ending up
confusing repair like this.
There have been worrying reports of data corruption over the last few
weeks, and I am sure the developers are trying to track them down.
In my time running Linux/Win? I have had corruption or major issues with most
of the filesystems at some time or other. I still trust XFS as I know where
it falls down for me. Of course YMMV.
My own system hasn't been used much recently as I have been away on
holiday, but I booted it this morning and everything seemed fine
(running Linux 2.4.17-xfs as downloaded from CVS at the end of last
year).
I would say it is partly because, as XFS gets more exposure, it gets tried
on more combinations of hardware and software (compilers).
This is very true and something that I'm very happy about. The more people
that run XFS the better chance of catching every little issue; which results
in a great stable filesystem very suitable for fileservers. Just look at
where all the operation time has put ext2.
I am not aware of any 'data corruption' issues remaining (emacs builds
should be fixed, and the fsx-linux program runs fine for me for 24 hours - I
believe it was being run in a patched kernel).
The directory corruption issues do concern me, and I have to admit to being
baffled right now, but the vast majority of users do not see any problems
at all.
Under my normal operational use of XFS (workstations and a 20-client
fileserver), XFS is as stable as they come. I have not had any issues with
these setups at all. All the issues that I have encountered have occurred
when I have been stress testing the system way past where it would normally
run, in an attempt to find the limitations.
Just an update here, some things which happened overnight:
o The directory corruption which Ralf Bergs has been fighting turns out to
  be hardware; they reproduced corruption under ext2.
o xfs_repair has been fixed to deal with the avl insert case.
o Finally, the thing I forgot to mention: fs corruption which happens due to
  hard machine failure (not a software crash) and is on an IDE drive with
  write caching turned on is to be expected. The write caching will break
  the ordering constraints expected by journalling filesystems; this will
  probably be true of ext3, jfs and reiserfs too.
If you are worried about this sort of corruption then hdparm -W0 /dev/hdX
is your friend. Of course it will also tell you how slow IDE drives really
are.
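
For anyone wanting to try the hdparm suggestion above, here is a minimal
sketch of how it might be wrapped. It is an illustration rather than a tested
recipe: /dev/hdX is the placeholder from the message and must be replaced with
a real device, the DEV and DRY_RUN variables and the run helper are
hypothetical names introduced here, and the commands need root. By default the
script only prints what it would do.

```shell
#!/bin/sh
# Sketch only. DEV, DRY_RUN and run() are illustrative names, not part of
# hdparm. /dev/hdX is a placeholder; substitute the real IDE device.
DEV="${DEV:-/dev/hdX}"
# DRY_RUN=1 (the default) prints the commands instead of executing them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Turn the drive's write cache off, as suggested in the message.
run hdparm -W0 "$DEV"

# Assumption: recent hdparm versions report the current write-caching
# setting when -W is given without a value. Check your man page first.
run hdparm -W "$DEV"

# To turn the cache back on later (trading safety for speed again):
#   hdparm -W1 "$DEV"
```

Run it as-is to see the commands, then with DRY_RUN=0 and DEV set to the real
device to apply them. The setting may not survive a reboot or drive reset, so
it may need to be reapplied from a boot script.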
So I think we are down to oopses now.
Steve