On Thu, Jan 29, 2004 at 05:49:07PM -0700, Craig Tierney wrote:
> I have just discovered that I am having problems with data corruption
> on my NFS servers and XFS. It happens in several different cases, but
> all under load. Here are the cases that I have gotten data corruption
> for reads and writes. Corruption happens on different servers and
> on different filesystems (some configured with LVM striping, some
> not).

Can you describe your test case in more detail? In particular,
do you have a program or programs that demonstrate the problem?
That is always a huge help. Failing that, a list of things to run
would do: what sort of IO is being done, and what does "under load"
mean in your context?

> We tested the new linux-2.4.21 kernel on the dual P3.

"new" and "2.4.21" don't really go together. :)

> The file writes are from single processes. Some codes are MPI, but
> all the IO, reads and writes, go through the rank 0 node. We can
> reproduce the corruption relatively easy when 16 processes are active.

Can you give me a recipe so that I can reproduce it locally?
Does NFS have to be in the picture for this to fail? And is
it reproducible without LVM too?
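
A recipe need not be elaborate. As a rough sketch only (the function
names, file names, sizes, and the choice of Python are all assumptions
for illustration, not taken from your setup), something like this writes
a known pattern from 16 processes, forces it to disk, reads it back, and
reports the first byte offset that differs:

```python
import hashlib
import os
from multiprocessing import Pool


def pattern(seed: int, size: int) -> bytes:
    """Deterministic data for one worker, so a mismatch can be located."""
    block = hashlib.sha256(str(seed).encode()).digest()  # 32 bytes
    return (block * (size // len(block) + 1))[:size]


def write_and_verify(args):
    """Write a known pattern, fsync it, read it back, and compare.

    Returns None on success, or (seed, first_bad_offset) on a mismatch.
    """
    directory, seed, size = args
    # Hypothetical file name; any unique name per worker would do.
    path = os.path.join(directory, f"corrupt-test-{seed}.dat")
    expected = pattern(seed, size)

    with open(path, "wb") as f:
        f.write(expected)
        f.flush()
        os.fsync(f.fileno())  # push the data out of this process's buffers

    with open(path, "rb") as f:
        actual = f.read()
    os.remove(path)

    if actual == expected:
        return None
    # Find the first differing byte; a short read counts as a mismatch
    # at the end of the shorter buffer.
    limit = min(len(actual), len(expected))
    offset = next((i for i in range(limit) if actual[i] != expected[i]), limit)
    return (seed, offset)


def run(directory: str, workers: int = 16, size: int = 1 << 20):
    """Run `workers` concurrent write/verify passes; return the failures."""
    jobs = [(directory, seed, size) for seed in range(workers)]
    with Pool(workers) as pool:
        results = pool.map(write_and_verify, jobs)
    return [r for r in results if r is not None]
```

Calling run() on a directory on the affected mount returns an empty list
if every file verified. Note that a read-back on the same client may be
served from the page cache, so a clean run here does not prove the data
on the server is good; verifying from a second client, or after a
remount, would be a stronger check.
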
Russell, does this sound like the NFS corruption that you
were looking into a while back?