[Bug 339] New: data loss problem

To: xfs-master@xxxxxxxxxxx
Subject: [Bug 339] New: data loss problem
From: bugzilla-daemon@xxxxxxxxxxx
Date: Sun, 20 Jun 2004 05:15:00 -0700
Sender: linux-xfs-bounce@xxxxxxxxxxx
http://oss.sgi.com/bugzilla/show_bug.cgi?id=339

           Summary: data loss problem
           Product: Linux XFS
           Version: unspecified
          Platform: IA32
        OS/Version: Linux
            Status: NEW
          Severity: major
          Priority: Medium
         Component: XFS kernel code
        AssignedTo: xfs-master@xxxxxxxxxxx
        ReportedBy: tsuda@xxxxxxxxxxxxxx


Hello Russell and XFS people,

I reproduced a data loss problem caused by inode i_size and
xfs_inode di_size being updated asynchronously.
My environment is as follows.

  OS  : RedHat9 + 2.4.26 + patch for XFS bugzilla #198
  CPU : Pentium4 (2.53GHz)
  MEM : 256MB

I made two simple test programs to reproduce the problem easily.

  Test program 1 (test-enospc-write.c)

    This program writes data to a file until it gets ENOSPC, then reads
    the data back to verify it, and repeats this cycle indefinitely.
    If the program detects bad data, it prints the following message
    and stops.

      *** error : bad data image=... offset=...

  Test program 2 (test-enospc-chmod.c)

    This program repeatedly calls chmod on the file in an infinite loop.
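
The two test programs are not included in this report. As a rough sketch of the behaviour described above, they might be built from helpers like the following. The names write_until_full(), verify_file() and toggle_mode(), the 4096-byte write unit, the data pattern and the 0644/0600 modes are all assumptions made for illustration, not the original code.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define BLOCK_SIZE 4096 /* assumed write unit; the report does not state one */

/* Fill buf with a pattern derived from the block index, so that a
 * misplaced or missing block is detectable on read-back. */
static void fill_pattern(unsigned char *buf, size_t len, unsigned long block)
{
    for (size_t i = 0; i < len; i++)
        buf[i] = (unsigned char)((block * 31 + i) & 0xff);
}

/* Write patterned blocks until ENOSPC (or until max_blocks is reached).
 * Returns the number of bytes written, or -1 on any other error. */
static long write_until_full(const char *path, unsigned long max_blocks)
{
    unsigned char buf[BLOCK_SIZE];
    long total = 0;
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);

    if (fd < 0)
        return -1;
    for (unsigned long b = 0; b < max_blocks; b++) {
        fill_pattern(buf, sizeof buf, b);
        ssize_t n = write(fd, buf, sizeof buf);
        if (n < 0) {
            if (errno == ENOSPC)
                break;          /* file system full: stop writing */
            close(fd);
            return -1;
        }
        total += n;
        if ((size_t)n < sizeof buf)
            break;              /* short write: almost no space left */
    }
    close(fd);
    return total;
}

/* Read the file back and compare it with the expected pattern.
 * Returns -1 if all expected_bytes match, the offset of the first
 * bad or missing byte otherwise, or -2 on an I/O error. */
static long verify_file(const char *path, long expected_bytes)
{
    unsigned char buf[BLOCK_SIZE], want[BLOCK_SIZE];
    long off = 0;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -2;
    while (off < expected_bytes) {
        long left = expected_bytes - off;
        size_t len = left < BLOCK_SIZE ? (size_t)left : BLOCK_SIZE;
        ssize_t n = read(fd, buf, len);
        if (n < 0) {
            close(fd);
            return -2;
        }
        fill_pattern(want, len, (unsigned long)(off / BLOCK_SIZE));
        for (size_t i = 0; i < len; i++) {
            /* A byte past end-of-file counts as bad data too. */
            if (i >= (size_t)n || buf[i] != want[i]) {
                close(fd);
                return off + (long)i;
            }
        }
        off += (long)len;
    }
    close(fd);
    return -1;
}

/* Test program 2 side: alternate the file mode so that every call
 * goes through the setattr path. Returns 0, or -1 if chmod fails. */
static int toggle_mode(const char *path, unsigned long iterations)
{
    for (unsigned long i = 0; i < iterations; i++)
        if (chmod(path, (i & 1) ? 0644 : 0600) != 0)
            return -1;
    return 0;
}
```

A driver for test program 1 would call write_until_full() and verify_file() in an endless loop and print the "*** error : bad data" message when verify_file() returns an offset. A driver for test program 2 would simply call toggle_mode() with a very large iteration count.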

In my environment the problem could be reproduced within 5 minutes
by running these two programs simultaneously.
The procedure is as follows.

  # gcc -o test-enospc-write test-enospc-write.c
  # gcc -o test-enospc-chmod test-enospc-chmod.c

  # mkfs -t xfs -f -d size=512m /dev/hda9
  # mount -t xfs /dev/hda9 /mnt/xfs

  # ./test-enospc-write /mnt/xfs/file1 &
  # ./test-enospc-chmod /mnt/xfs/file1 &


I have investigated the problem using a local trace embedded in the kernel.
The problem seems to occur in the following scenario
(in chronological order; TP1 and TP2 denote test programs 1 and 2).

  1. TP1 sleeps in the middle of processing a write request
     After TP1 has processed mKB of a write request, it calls
     balance_dirty() to ease memory pressure and sleeps
     while dirty buffers are flushed (waiting for resources).
     At this time, xfs_inode di_size is smaller than inode i_size,
     because a_op->write_commit updates only inode i_size.

                         file image
                         offset=0  lKB
                             +--+...+--+--+
                             |  |   |  |  |
                             +--+...+--+--+
                                    +-----+
                                      mKB

        inode i_size      :  -------------> (l+mKB)
        xfs_inode di_size :  -------> (lKB)

  2. TP2 revalidates inode i_size
     TP2 calls vn_revalidate() in linvfs_setattr().
     At this time, inode i_size is reset to the same value as
     xfs_inode di_size.
     As a result, inode i_size is lKB!

                         file image
                         offset=0  lKB
                             +--+...+--+--+
                             |  |   |  |  |
                             +--+...+--+--+
                                    +-----+
                                      mKB

        inode i_size      :  -------> (lKB)
        xfs_inode di_size :  -------> (lKB)

  3. TP1 flushes dirty and delayed buffers
     TP1 wakes up and continues processing the write request.
     TP1 detects ENOSPC in xfs_iomap_write_delay() and
     calls xfs_flush_space() to obtain free space
     by flushing dirty and delayed buffers.

     But flushing the buffers already processed in the current write
     request fails, and these buffers are discarded in
     xfs_page_state_convert(), because their position is beyond
     inode i_size.

  4. TP1 updates both inode i_size and xfs_inode di_size
     TP1 updates both inode i_size and xfs_inode di_size to
     l+nKB (nKB >= mKB) at the end of processing the write request.
     But part of the written data has been lost.

                         file image
                         offset=0  lKB
                             +--+...+--+--+--+
                             |  |   |  |  |  |
                             +--+...+--+--+--+
                                    +--------+
                                      nKB
                                    +-----+
                                    data loss

        inode i_size      :  -------------------> (l+nKB)
        xfs_inode di_size :  -------------------> (l+nKB)

I made a patch for 2.4.26 which updates both inode i_size and
xfs_inode di_size simultaneously at a_op->write_commit.
I am running the test on a kernel with this patch applied;
it has been running for over 6 hours with no data loss.
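
The scenario above, and the effect of updating both sizes at commit time, can be illustrated with a small user-space model. This is only a sketch under simplifying assumptions: sizes are counted in pages rather than KB, the model_* functions are hypothetical stand-ins for the kernel paths named in the report, and the code is neither the XFS code nor the actual patch.

```c
#include <stdbool.h>

#define MAX_PAGES 16 /* arbitrary limit for this model */

struct model_inode {
    long i_size;             /* stands in for inode i_size */
    long di_size;            /* stands in for xfs_inode di_size */
    bool dirty[MAX_PAGES];   /* page holds data not yet flushed */
    bool on_disk[MAX_PAGES]; /* page data has reached the disk */
    bool patched;            /* true: commit also updates di_size */
};

/* Commit one written page. Unpatched, only i_size grows (step 1). */
static void model_write_commit(struct model_inode *ip, long page)
{
    ip->dirty[page] = true;
    if (page + 1 > ip->i_size)
        ip->i_size = page + 1;
    if (ip->patched && ip->i_size > ip->di_size)
        ip->di_size = ip->i_size;
}

/* Revalidation as described in step 2: i_size is reset from di_size. */
static void model_revalidate(struct model_inode *ip)
{
    ip->i_size = ip->di_size;
}

/* Flush as described in step 3: dirty pages beyond i_size are
 * discarded instead of written. Returns the number discarded. */
static int model_flush(struct model_inode *ip)
{
    int discarded = 0;

    for (long p = 0; p < MAX_PAGES; p++) {
        if (!ip->dirty[p])
            continue;
        if (p < ip->i_size)
            ip->on_disk[p] = true;
        else
            discarded++;
        ip->dirty[p] = false;
    }
    return discarded;
}

/* Replay steps 1 to 4 for a file of l pages and a write of n pages
 * that is interrupted after m pages. Returns the number of pages
 * below the final size that never reached the disk, or -1 if the
 * arguments do not fit the model. */
static int run_scenario(bool patched, long l, long m, long n)
{
    struct model_inode ino = { .patched = patched };
    int lost = 0;

    if (l < 0 || m < 0 || m > n || l + n > MAX_PAGES)
        return -1;

    for (long p = 0; p < l; p++)        /* existing file contents */
        ino.on_disk[p] = true;
    ino.i_size = ino.di_size = l;

    for (long p = l; p < l + m; p++)    /* step 1: TP1 commits m pages */
        model_write_commit(&ino, p);
    model_revalidate(&ino);             /* step 2: TP2 revalidates */
    model_flush(&ino);                  /* step 3: TP1 flushes on ENOSPC */

    for (long p = l + m; p < l + n; p++) /* step 4: rest of the write */
        model_write_commit(&ino, p);
    model_flush(&ino);
    ino.i_size = ino.di_size = l + n;   /* both sizes set at the end */

    for (long p = 0; p < l + n; p++)
        if (!ino.on_disk[p])
            lost++;
    return lost;
}
```

In this model, run_scenario(false, 4, 2, 3) returns 2, which corresponds to the mKB of lost data in step 4, while run_scenario(true, 4, 2, 3) returns 0 because di_size never falls behind i_size.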

---
 Masanori TSUDA



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

