
xfs corruption with XFS_IOC_RESVSP

To: linux-xfs@xxxxxxxxxxx
Subject: xfs corruption with XFS_IOC_RESVSP
From: Miquel van Smoorenburg <miquels@xxxxxxxxxxx>
Date: Thu, 25 Nov 2004 20:15:26 +0000
Sender: linux-xfs-bounce@xxxxxxxxxxx

Let me start off by saying that this isn't easily reproducible - I haven't been able to reproduce it at will just yet.


I have an application that appends slowly and randomly to tens of
thousands of database files, which are later read sequentially. Because
the files are opened, written to (a few hundred bytes) and closed,
all in random order, fragmentation is enormous.

I've been using ioctl(filefd, XFS_IOC_RESVSP, (xfs_flock64_t *)&req) to
preallocate space for these files in 256Kbyte chunks. As soon as a
write to a file would cross a 256K chunk boundary, another 256K is
allocated. Of course this is also done at offset 0.

So we have:

 preallocate 256K (0 .. 256K-1)
 write 700 bytes (file size 700)
 write 800 bytes (file size 1500)
 ... (more small writes)
 write 900 bytes (file size 256K - 300)
 preallocate 256K (256K .. 512K-1)
 write 800 bytes (file size 256K + 500)
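
The boundary rule above can be sketched as a small helper. This is only one
reading of the description, not the application's actual code: need_prealloc
and CHUNK_SIZE are names made up for the illustration, and the sketch assumes
a single write is never larger than one chunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CHUNK_SIZE (256 * 1024) /* preallocation granularity from the post */

/*
 * Decide whether appending `len` bytes at offset `pos` needs a fresh
 * preallocation. Returns true and stores the chunk-aligned start offset
 * in *start when the write begins exactly on a chunk boundary (this
 * covers offset 0) or when its last byte lands in the next chunk.
 * Hypothetical helper; assumes 0 < len <= CHUNK_SIZE.
 */
static bool need_prealloc(int64_t pos, int64_t len, int64_t *start)
{
    assert(pos >= 0 && len > 0 && len <= CHUNK_SIZE);

    int64_t first_chunk = pos / CHUNK_SIZE;
    int64_t last_chunk = (pos + len - 1) / CHUNK_SIZE;

    if (pos % CHUNK_SIZE == 0) {
        /* Write starts a chunk that has not been reserved yet. */
        *start = pos;
        return true;
    }
    if (last_chunk != first_chunk) {
        /* Write spills over into the following chunk. */
        *start = last_chunk * CHUNK_SIZE;
        return true;
    }
    return false;
}
```

Fed the sequence above, the helper asks for a reservation at offset 0 for the
first write, nothing for the 800-byte write at offset 700, and a reservation
at offset 256K for the 800-byte write at offset 256K - 300.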

This actually works as expected and has been running on several machines
for quite some time with a 2.6.9 Linux kernel.

Now I installed this on a Dual Xeon with 4GB memory that has a really
high load (hundreds of simultaneous database connections, 6-disk RAID5
that is 100% loaded all of the time). Suddenly, after a week or so of
running, the database files became corrupted - NULs and random binary junk
in the middle of the files.
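
For anyone wanting to check their own files for this kind of damage, a scan
for runs of NUL bytes is one simple test. The sketch below is an illustration
only: longest_nul_run is a made-up name, and it assumes the records are plain
text that never legitimately contains NUL bytes. It will not catch the
"random binary junk" case.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the length of the longest run of consecutive NUL bytes in buf.
 * Hypothetical helper: a nonzero result in a text-only file suggests a
 * region that was allocated but never filled with the intended data.
 */
static size_t longest_nul_run(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);

    size_t best = 0; /* longest run seen so far */
    size_t run = 0;  /* length of the run ending at the current byte */

    for (size_t i = 0; i < len; i++) {
        if (buf[i] == 0) {
            run++;
            if (run > best)
                best = run;
        } else {
            run = 0;
        }
    }
    return best;
}
```

In practice the file would be read (or mmap'ed) in pieces and the run length
carried across piece boundaries; that bookkeeping is left out here.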

I wrote a small app to recreate the I/O patterns and sure enough, the
same damage to the files:

        lockf(fd, LOCK);
        pos = lseek(fd, 0, SEEK_END);
        if (pos + num_bytes_to_write would cross a 256K boundary)
                prealloc(another 256K at that boundary);
        write(fd, buf, num_bytes_to_write);
        lockf(fd, UNLOCK);

I ran this in 4 processes on the same file and corruption would show up:
usually a bunch of NULs were detected in the file.
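
A compilable version of the lock / seek-to-end / write sequence might look
like the sketch below. It is not the test program itself. It swaps lockf()
for fcntl() record locks, because lockf() locks a range relative to the
current file offset while fcntl() can name the whole file explicitly. The
function names and the prealloc callback are invented for the illustration;
the callback stands in for the XFS_IOC_RESVSP call and may be NULL.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Lock (F_WRLCK) or unlock (F_UNLCK) the whole file, blocking if needed. */
static int lock_whole_file(int fd, short type)
{
    struct flock fl = {0};

    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0; /* 0 means "through end of file", however far it grows */
    return fcntl(fd, F_SETLKW, &fl);
}

/*
 * Append len bytes under an exclusive lock. The hypothetical prealloc
 * callback is called with the append offset before the write and must
 * return 0 on success. Returns the write() result, or -1 on failure.
 */
static ssize_t locked_append(int fd, const void *buf, size_t len,
                             int (*prealloc)(int fd, off_t pos, size_t len))
{
    assert(buf != NULL && len > 0);

    if (lock_whole_file(fd, F_WRLCK) < 0)
        return -1;

    ssize_t n = -1;
    off_t pos = lseek(fd, 0, SEEK_END); /* end of file is stable while locked */

    if (pos >= 0 && (prealloc == NULL || prealloc(fd, pos, len) == 0))
        n = write(fd, buf, len);

    lock_whole_file(fd, F_UNLCK);
    return n;
}
```

With every writer going through locked_append(), the end-of-file lookup, the
reservation and the write happen as one step as far as the other cooperating
processes are concerned, which is the property the test relies on.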

On a non-XFS partition (ext3) I didn't see any problems, but of course
ext3 has no EXT3_IOC_RESVSP ioctl.

After a few days of testing, the server locked up - I rebooted it,
and I haven't been able to repeat the corruption. Probably because
it took a week of running this code in the main database application
for the problem to manifest itself in the first place, and I took the
preallocation code out of the main database application ...

So, I'm not able to reproduce this yet, but I decided to post this here
anyway to see if anyone has an "aha" moment when reading this, and to
have it in the archives in case anyone hits the same problem.

(BTW - the database application I'm talking about is Diablo dreaderd,
and the database is the header database in /news/spool/group/ ).

