--- Begin Message ---
Hi,
I have reproduced a similar problem on xfs1.3.1 (based on 2.4.21).
My environment is as follows.
nfs server :
OS : RedHat9 + xfs1.3.1 (based on 2.4.21)
CPU : Xeon(2.4GHz) x 2
MEM : 1GB
NIC : Intel PRO/1000
Local Filesystem : XFS, the refcache is disabled.
nfs client :
OS : RedHat9 (based on 2.4.20-8)
NIC : Intel PRO/1000
NFS Ver. : 3
NFS Mount Options : udp,hard,intr,wsize=8192
Within 1 hour of running the test, the corruption was detected.
(To make the corruption easy to detect, I umount nfs, umount xfs,
mount xfs and mount nfs before comparing data, i.e. purge the memory cache.)
The corruption width was a multiple of 4KB, starting at a 4KB boundary.
In many cases, it occurred at the start of a physical extent.
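As a rough illustration of the comparison step, the sketch below reports mismatching regions at 4KB granularity. It is not the test program used here; the function name `find_corrupt_ranges` and the way the expected data is supplied are assumptions made only for this example.

```python
BLOCK = 4096  # 4KB granularity, matching the corruption width described above


def find_corrupt_ranges(expected: bytes, actual: bytes, block: int = BLOCK):
    """Return (offset, length) runs of consecutive blocks that differ."""
    ranges = []
    size = max(len(expected), len(actual))
    run_start = None
    for off in range(0, size, block):
        differs = expected[off:off + block] != actual[off:off + block]
        if differs and run_start is None:
            run_start = off                      # a corrupt run begins here
        elif not differs and run_start is not None:
            ranges.append((run_start, off - run_start))
            run_start = None                     # the run ended at this block
    if run_start is not None:
        ranges.append((run_start, size - run_start))
    return ranges
```

In a real run, `expected` would be the pattern the nfs client wrote and `actual` the file contents read back after the umount/mount cycle.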
I have investigated the issue using the kernel embedded local trace.
I think that the issue is caused by the delayed allocation mechanism.
Below I explain an example corruption scenario, which is my guess.
The steps of the scenario are listed in chronological order.
1. open and write in nfsd (for write1)
The nfs client writes 8KB of data to the file (called write1).
The write request is processed in nfsd. The nfsd calls open [linvfs_open]
and then write [linvfs_write]. After the write, the file has several
delayed allocation blocks beyond the end of the file, because of allocation
in chunks and alignment to writeiosize.
file image
offset=0  eof
+----+----+----+----+----+- ... +----+
|    |    |    |    |    |      |    |
+----+----+----+----+----+- ... +----+
 4KB  4KB
+---------+
 write data (write1)
+------------------------------------+
 delayed allocation blocks
2. allocate disk space in kupdated (for write1)
Disk space is allocated for the delayed allocation blocks before the data
is flushed to disk [linvfs_writepage, page_state_convert].
file image
offset=0  eof
+----+----+----+----+----+- ... +----+
|    |    |    |    |    |      |    |
+----+----+----+----+----+- ... +----+
 4KB  4KB
+---------+
 write data (write1)
+------------------------------------+
 allocated disk space
+---------+
 called disk space1
          +--------------------------+
           called disk space2
3. close in nfsd (for write1)
The nfsd calls close [linvfs_release]. At this time, the allocated disk space
beyond the end of the file (disk space2) is truncated when the refcache is
disabled [xfs_inactive_free_eofblocks].
file image
offset=0  eof
+----+----+
|    |    |
+----+----+
 4KB  4KB
+---------+
 write data (write1)
+---------+
 disk space1
4. open and write in nfsd (for write2)
The nfs client then writes another 8KB of data to the file (called write2).
The nfsd calls open [linvfs_open] and then write [linvfs_write].
file image
offset=0            eof
+----+----+----+----+----+- ... +----+
|    |    |    |    |    |      |    |
+----+----+----+----+----+- ... +----+
 4KB  4KB  4KB  4KB
+---------+
 write data (write1)
          +---------+
           write data (write2)
          +--------------------------+
           delayed allocation blocks
+---------+
 disk space1
5. flush data to disk in kupdated (for write1)
The write data (write1) is flushed to disk space1 [page_state_convert].
And the write data (write2) is flushed to disk space2 [cluster_write] !!!,
because the buffers of write data (write2) are in the dirty and delay state.
But disk space2 does not exist at this time.
Disk space2 may be in use by another file, or may be free space.
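The five steps above can be replayed with a small toy model. This is only a sketch of the guessed scenario, not XFS code: the 8-block chunk size, the variable names, and the idea that the flusher keeps the mapping obtained in step 2 are all assumptions made for the illustration.

```python
BLOCK = 4096


def simulate():
    """Replay steps 1-5; return file offsets written to disk space the file no longer owns."""
    # Step 1: write1 puts 8KB at offset 0; delayed allocation extends past EOF.
    eof = 2 * BLOCK
    prealloc_end = 8 * BLOCK              # hypothetical chunk size
    dirty_delay = {0, BLOCK}              # offsets of dirty, delayed-allocation buffers

    # Step 2: real disk space is allocated for the whole delayed-allocation
    # range, and the flusher keeps that mapping for the rest of the flush.
    owned = set(range(0, prealloc_end, BLOCK))   # file offsets backed by disk
    mapping_end = prealloc_end

    # Step 3: close truncates the space beyond EOF (disk space2).
    owned = {off for off in owned if off < eof}

    # Step 4: write2 appends 8KB; its buffers are dirty and delay.
    dirty_delay |= {2 * BLOCK, 3 * BLOCK}
    eof = 4 * BLOCK

    # Step 5: every dirty delay buffer covered by the old mapping is flushed,
    # including the write2 buffers that now sit over freed space.
    flushed = sorted(off for off in dirty_delay if off < mapping_end)
    return [off for off in flushed if off not in owned]
```

In this model `simulate()` returns the offsets 8192 and 12288, i.e. the two write2 blocks, which matches a corruption width that is a multiple of 4KB.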
I think that one solution for the issue is to flush only the buffers within
the end of the file before allocating disk space for delayed allocation blocks,
and not to flush the buffers beyond it.
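The idea can be sketched as a filter on which buffers are eligible for the flush. This is not the attached patch; `buffers_to_flush` and its parameters are hypothetical names, and taking `eof` as the file size seen when the disk space is allocated is an assumption of this sketch.

```python
BLOCK = 4096


def buffers_to_flush(dirty_delay_offsets, mapping_end, eof):
    """Select the dirty delayed-allocation buffers that start below EOF.

    Buffers beyond EOF are left alone even when the allocated mapping
    extends further than the end of the file.
    """
    limit = min(mapping_end, eof)
    return sorted(off for off in dirty_delay_offsets if off < limit)
```

For example, with buffers at 0, 4KB, 8KB and 12KB, a mapping that ends at 8 * BLOCK and eof at 2 * BLOCK, only the buffers at 0 and 4KB are selected, so nothing is written over disk space2.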
I made a patch for xfs1.3.1. I am running the test on a kernel with the
patch applied; it has been running for over 16 hours with no corruption.
Could you please comment on the attached patch?
Regards,
Tsuda
In message "data corruption on nfs+xfs"
(04/05/27 15:58:48),
kazuyuki@xxxxxxxxxxxxxxxxxxx wrote...
>We are experiencing the same problem as No.198.
> http://oss.sgi.com/bugzilla/show_bug.cgi?id=198
> http://marc.theaimsgroup.com/?t=108343605300001&r=1&w=2
>
>We have confirmed that even when the refcache is disabled, setting
>fs.xfs.refcache_size to zero through sysctl, the problem does not disappear.
>To run linux as single CPU mode, it makes the problem slightly hard to occur,
>but it still occurs.
>
>Two types of corruption we've seen:
>
> 1) Width is a multiple of 8kB, starting at 8kB boundary.
> *Maybe the same trouble as No.198.
>
> 2) Width is a 964 bytes, ending up to 4kB boundary.
> *I'm not sure the cause is same as 1) above.
>
>We have tested on 2.4.20-20.9.XFS1.3.1, 2.4.20-30.9.sgi1 XFS1.3.3 and other
>kernels
>based on 2.4.20-20 on which we made some changes.
>
>Anyone who knows where is the cause. On page cache, disk block handling, or
>other parts?
>Or who knows how to avoid this with some setting or another version?
>
xfs1.3.1-delalloc.patch
Description: Binary data
--- End Message ---