------- Additional Comments From Martin@xxxxxxxxxxxx 2006-06-21 00:17 -------
There is some kernel documentation about the write barrier stuff. I do not have
an unpacked 2.6.16 kernel tree at hand right now, but you should be able to find
it by using find -name "*barrier*" or grep -ir "barrier" *. That explains the
issue quite nicely as does the SGI FAQ I posted before.
But even after reading it I do not understand this issue completely.
I do not understand why I got three crashes with 2.6.16 in one week while
2.6.15 ran quite stably. It was not perfect with 2.6.15, but at least I
only got XFS corruption rarely after a DRI savage driver crash or when suspend
to disk did not work correctly - when the machine was not online as you say.
Actually XFS survived most of those crashes nicely. With 2.6.16 at least once -
when I used kdissert - the kernel just went down while I was using the machine
normally (no 3D stuff and no suspend to disk issues). Even if kdissert /
KDE somehow managed to crash X.org, the kernel should still have been alive and
X.org should have been restarted. So either kernel 2.6.16 was a lot more
unstable than 2.6.15 in the beginning, or XFS had an issue with the write cache
enabled that occurred during normal operation and not only on power outages and
kernel crashes. I had no kernel crashes during regular use with 2.6.16 once I
disabled the write cache, which may point to the second alternative.
I repaired the filesystem after each event, either by using xfs_repair or, when
the damage was too big, by restoring a backup via rsync.
Anyway, I think it's best to test with 2.6.17 again with barrier functionality
and write cache enabled. I will do so once 2.6.17 has matured a bit more and I
do not hear about new issues, because this is a production machine and I lose
quite some time on each filesystem crash that happens.