[xfs-masters] [Bug 756] New: File data corruption when writing to files

To: xfs-master@xxxxxxxxxxx
Subject: [xfs-masters] [Bug 756] New: File data corruption when writing to files with DM_EVENT_WRITE enabled over NFS (2.4 kernel)
From: bugzilla-daemon@xxxxxxxxxxx
Date: Tue, 1 May 2007 14:04:12 -0700
Reply-to: xfs-masters@xxxxxxxxxxx
Sender: xfs-masters-bounce@xxxxxxxxxxx
http://oss.sgi.com/bugzilla/show_bug.cgi?id=756

           Summary: File data corruption when writing to files with
                    DM_EVENT_WRITE enabled over NFS (2.4 kernel)
           Product: Linux XFS
           Version: Current
          Platform: PC
        OS/Version: Linux
            Status: NEW
          Severity: normal
          Priority: P2
         Component: XFS kernel code
        AssignedTo: xfs-master@xxxxxxxxxxx
        ReportedBy: kjamieson@xxxxxxxxxx


This was observed with a fairly old snapshot of the SGI linux-2.4-xfs CVS tree
(2.4.27, taken 2004-11-15), but from code inspection the problem appears to
still exist in the latest linux-2.4-xfs source.

The problem occurs when NFS clients submit a set of write requests for a file
in parallel (e.g., write 8K at offset 8K, write 8K at offset 16K) and the
server is running multiple NFS server threads. We have seen it with 2.4 and
2.6 Linux clients; other platforms have not been tested.
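
The write pattern described above can be approximated from a client with a
small test program. This is a hypothetical sketch, not the reproducer used for
this report: the names write_chunks_in_parallel() and count_bad_bytes() are
made up for illustration, and it assumes that writes issued from separate
threads reach the server as parallel NFS requests, which depends on the
client. It also does not set up the managed region or the write event on the
server.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <pthread.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define CHUNK      8192   /* 8K per write, as in the example offsets */
#define MAX_CHUNKS 64

struct chunk_job {
    int   fd;
    off_t offset;
    int   ok;             /* set to 1 if the full chunk was written */
};

/* Thread body: write one 8K chunk of 'a' at the job's offset. */
static void *write_chunk(void *arg)
{
    struct chunk_job *job = arg;
    char buf[CHUNK];

    memset(buf, 'a', sizeof buf);
    job->ok = pwrite(job->fd, buf, sizeof buf, job->offset)
              == (ssize_t)sizeof buf;
    return NULL;
}

/* Write nchunks adjacent 8K chunks, one thread each. Returns 0 or -1. */
static int write_chunks_in_parallel(const char *path, int nchunks)
{
    pthread_t tids[MAX_CHUNKS];
    struct chunk_job jobs[MAX_CHUNKS];
    int i, started = 0, rc = 0;
    int fd;

    assert(nchunks > 0 && nchunks <= MAX_CHUNKS);
    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    for (i = 0; i < nchunks; i++) {
        jobs[i].fd = fd;
        jobs[i].offset = (off_t)i * CHUNK;
        jobs[i].ok = 0;
        if (pthread_create(&tids[i], NULL, write_chunk, &jobs[i]) != 0) {
            rc = -1;
            break;
        }
        started++;
    }
    for (i = 0; i < started; i++) {
        pthread_join(tids[i], NULL);
        if (!jobs[i].ok)
            rc = -1;
    }
    if (close(fd) != 0)
        rc = -1;
    return rc;
}

/* Count bytes that are not 'a'. Returns -1 on I/O error. */
static long count_bad_bytes(const char *path)
{
    char buf[CHUNK];
    long bad = 0;
    ssize_t n, i;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;
    while ((n = read(fd, buf, sizeof buf)) > 0)
        for (i = 0; i < n; i++)
            if (buf[i] != 'a')
                bad++;
    close(fd);
    return n < 0 ? -1 : bad;
}
```

Build with -pthread. Calling write_chunks_in_parallel() on a path under the
NFS mount and then count_bad_bytes() on the same path should report 0 on a
healthy server; a non-zero count would indicate a range that was not written
as expected.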

If the file has a managed region with write events set, there is a race in
xfs_write() in fs/xfs/linux-2.4/xfs_lrw.c between multiple write requests as
follows:

The file size is read after locking the inode:

        xfs_ilock(xip, XFS_ILOCK_EXCL|iolock);

        isize = xip->i_d.di_size;
        
But the inode is unlocked around the call to send the DMAPI event:

        if ((DM_EVENT_ENABLED(vp->v_vfsp, xip, DM_EVENT_WRITE) &&
            !(ioflags & IO_INVIS) && !eventsent)) {
                ...
                xfs_iunlock(xip, XFS_ILOCK_EXCL);
                error = XFS_SEND_DATA(xip->i_mount, DM_EVENT_WRITE, vp,
                                      *offset, size,
                                      dmflags, &locktype);
                ...
                xfs_ilock(xip, XFS_ILOCK_EXCL);
                ...
        }

So if xip->i_d.di_size changes between the lock being dropped and re-acquired,
the stale value in isize may then cause previously written data to be zeroed:

        if (!(ioflags & IO_ISDIRECT) && (*offset > isize && isize)) {
                error = xfs_zero_eof(BHV_TO_VNODE(bdp), io, *offset,
                        isize, *offset + size);
                ...
        }

For example, in this test file the range 0x2000-0x14000 reads back as zeros:

# hexdump -C /mnt/test/file1 
00000000  61 61 61 61 61 61 61 61  61 61 61 61 61 61 61 61  |aaaaaaaaaaaaaaaa|
*
00002000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00014000  61 61 61 61 61 61 61 61  61 61 61 61 61 61 61 61  |aaaaaaaaaaaaaaaa|
*
00019000
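
The interleaving described above can be modelled in user space. The following
is a single-threaded toy sketch, not kernel code and not the attached patch:
struct toy_inode, toy_write() and toy_race() are invented for illustration.
It replays the order "B samples the size, A extends the file, B continues"
and shows that the stale size makes B zero A's data. Re-reading the size
after the lock is re-acquired is shown as one plausible remedy; whether the
attached patch or the 2.6 change takes this approach is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX 65536

/* Toy stand-in for an inode: file contents plus a size like di_size. */
struct toy_inode {
    unsigned char data[TOY_MAX];
    size_t size;
};

/*
 * Finish a write of len bytes of 'a' at off, using the caller's idea of
 * the file size (isize). Mirrors the check in the report: if the write
 * starts past a non-zero isize, the gap [isize, off) is zeroed first.
 */
static void toy_write(struct toy_inode *ip, size_t isize,
                      size_t off, size_t len)
{
    assert(off + len <= TOY_MAX);
    if (off > isize && isize)
        memset(ip->data + isize, 0, off - isize);
    memset(ip->data + off, 'a', len);
    if (off + len > ip->size)
        ip->size = off + len;
}

/*
 * Replay the race deterministically and return the number of zero bytes
 * left inside the file. reread_size != 0 models refreshing the cached
 * size after the lock is taken again.
 */
static size_t toy_race(int reread_size)
{
    static struct toy_inode ino;
    size_t isize_b, i, zeros = 0;

    memset(&ino, 0, sizeof ino);
    toy_write(&ino, ino.size, 0, 8192);     /* file starts as 8K of 'a' */

    isize_b = ino.size;                     /* B (offset 16K) samples size */
    /* B drops the lock to send the event; A writes 8K at offset 8K. */
    toy_write(&ino, ino.size, 8192, 8192);
    /* B takes the lock again. */
    if (reread_size)
        isize_b = ino.size;
    toy_write(&ino, isize_b, 16384, 8192);  /* B writes 8K at offset 16K */

    for (i = 0; i < ino.size; i++)
        if (ino.data[i] == 0)
            zeros++;
    return zeros;
}
```

With the stale size, toy_race(0) returns 8192: the 8K written by A at offset
8K has been zeroed. With the size re-read, toy_race(1) returns 0.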


The patch we are using to fix this issue is attached.

Note that a similar issue existed in the 2.6 SGI kernel up until it was resolved
by this recent change:
http://oss.sgi.com/cgi-bin/cvsweb.cgi/linux-2.6-xfs/fs/xfs/linux-2.6/xfs_lrw.c.diff?r1=1.258;r2=1.259;f=h


