Issues with delalloc->real extent allocation
bpm at sgi.com
Fri Jan 14 17:50:56 CST 2011
On Fri, Jan 14, 2011 at 03:43:34PM -0600, bpm at sgi.com wrote:
> On Fri, Jan 14, 2011 at 11:29:00AM +1100, Dave Chinner wrote:
> > I've noticed a few suspicious things trying to reproduce the
> > allocate-in-the-middle-of-a-delalloc-extent,
> ...
> > Secondly, I think we have the same expose-the-entire-delalloc-extent
> > -to-stale-data-exposure problem in ->writepage. This one, however,
> > is due to using BMAPI_ENTIRE to allocate the entire delalloc extent
> > the first time any part of it is written to. Even if we are only
> > writing a single page (i.e. wbc->nr_to_write = 1) and the delalloc
> > extent covers gigabytes. So, same problem when we crash.
> >
> > Finally, I think the extsize based problem exposed by test 229 is
> > also a result of allocating space we have no pages covering in the
> > page cache (triggered by BMAPI_ENTIRE allocation) so the allocated
> > space is never zeroed and hence exposes stale data.
>
> This is precisely the bug I was going after when I hit the
> allocate-in-the-middle-of-a-delalloc-extent bug. This is a race between
> block_prepare_write/__xfs_get_blocks and writepage/xfs_page_state_convert.
> When xfs_page_state_convert allocates a real extent for a page
> toward the beginning of a delalloc extent, XFS_BMAPI converts the entire
> delalloc extent. Any subsequent writes into the page cache toward the
> end of this freshly allocated extent will see a written extent instead
> of delalloc and read the block from disk into the page before writing
> over it. If the write does not cover the entire page, garbage from disk
> will be exposed into the page cache.
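The sequence described above can be modeled in user space. What follows is only a toy sketch, not kernel code: the names ext_state, model_partial_write and model_stale_bytes are made up for illustration, and a 4096-byte page is assumed. It shows why a sub-page write over a block that is already seen as a real extent picks up whatever is on disk, while the same write over delalloc starts from a zeroed page.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SZ 4096	/* assumed page size */

/* hypothetical model states, not the kernel's extent states */
enum ext_state { EXT_DELALLOC, EXT_REAL };

/*
 * Model a buffered write of len bytes at offset off within one page.
 * disk_block is what currently sits on disk for that page; for a block
 * that was allocated but never written or zeroed, that is stale data.
 */
static void model_partial_write(enum ext_state state,
				const unsigned char *disk_block,
				unsigned char *page,
				size_t off, const void *data, size_t len)
{
	assert(off <= PAGE_SZ && len <= PAGE_SZ - off);
	if (state == EXT_REAL && len < PAGE_SZ)
		memcpy(page, disk_block, PAGE_SZ);	/* block is read from disk first */
	else
		memset(page, 0, PAGE_SZ);		/* nothing on disk yet, page is zeroed */
	memcpy(page + off, data, len);
}

/* Count non-zero bytes outside [off, off + len), i.e. exposed stale data. */
static size_t model_stale_bytes(const unsigned char *page, size_t off, size_t len)
{
	size_t i, n = 0;

	for (i = 0; i < PAGE_SZ; i++)
		if ((i < off || i >= off + len) && page[i] != 0)
			n++;
	return n;
}
```

With a disk block filled with a non-zero pattern, the EXT_DELALLOC case leaves no stale bytes in the page, while the EXT_REAL case leaves every byte outside the written range stale.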
Here is a test case to reproduce the corruption. I have only been able
to reproduce it by writing the file on an nfs client served from xfs
that is allocating large delalloc extents.
-Ben
*** the writer
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>

int
main(int argc, char *argv[]) {
    char *filename;
    off_t seekdist = 3071; /* less than a page, nice and odd */
    off_t max_offset = 1024 * 1024 * 1024; /* 1 gig */
    off_t current_offset = 0;
    char buf[] = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\n";
    size_t len = strlen(buf);
    int fd;

    if (argc < 2) {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return -1;
    }
    filename = argv[1];
    printf("writing to %s\n", filename);
    printf("strlen is %zu\n", len);
    fd = open(filename, O_RDWR|O_CREAT, 0644);
    if (fd == -1) {
        perror(filename);
        return -1;
    }
    /* seek seekdist bytes past EOF to leave a hole, then append buf */
    while ((current_offset = lseek(fd, seekdist, SEEK_END)) > 0
            && current_offset < max_offset) {
        if (write(fd, buf, len) != (ssize_t)len) {
            perror("write 'a'");
            return -1;
        }
    }
    close(fd);
    return 0;
}
*** the reader
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>

int
main(int argc, char *argv[]) {
    char *filename;
    off_t seekdist = 3071; /* less than a page, nice and odd */
    off_t max_offset = 1024 * 1024 * 1024; /* 1 gig */
    off_t current_offset = 0;
    char buf[] = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\n";
    size_t len = strlen(buf);
    char readbuf[4096];
    ssize_t nread;
    int fd, i;

    if (argc < 2) {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return -1;
    }
    filename = argv[1];
    printf("reading from %s\n", filename);
    fd = open(filename, O_RDONLY);
    if (fd == -1) {
        perror(filename);
        return -1;
    }
    while (current_offset < max_offset) {
        /* the hole left by each lseek should read back as nulls */
        nread = read(fd, readbuf, seekdist);
        if (nread != seekdist) {
            perror("read nulls");
            return -1;
        }
        for (i = 0; i < seekdist; i++) {
            if (readbuf[i] != '\0') {
                /* readbuf is not NUL-terminated, so bound the print */
                printf("found non-null at %lld\n%.*s\n",
                        (long long)(current_offset + i),
                        (int)(seekdist - i), &readbuf[i]);
                break;
                // return -1;
            }
        }
        current_offset += nread;
        /* the record written after the hole should follow */
        nread = read(fd, readbuf, len);
        if (nread != (ssize_t)len) {
            perror("read a");
            return -1;
        }
        if (strncmp(readbuf, buf, len)) {
            printf("didn't match at %lld\n%.*s\n",
                    (long long)(current_offset + nread),
                    (int)len, readbuf);
            // return -1;
        }
        current_offset += nread;
    }
    close(fd);
    return 0;
}
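To interpret the offsets the reader prints, it can help to spell out the layout the writer produces: each pass seeks seekdist bytes past EOF and then appends buf, so the file is a repeating pattern of a 3071-byte hole followed by one record. A small sketch of that arithmetic follows. The helper names record_start and expect_nul are made up for illustration, and reclen stands for strlen(buf) in the programs above.

```c
#include <assert.h>
#include <sys/types.h>

#define SEEKDIST 3071	/* hole length, same value as seekdist above */

/* Offset at which the writer's n-th write (0-based) starts. */
static off_t record_start(off_t n, off_t reclen)
{
	assert(n >= 0 && reclen > 0);
	return n * (SEEKDIST + reclen) + SEEKDIST;
}

/* Nonzero if the byte at pos lies in a hole and should read back as NUL. */
static int expect_nul(off_t pos, off_t reclen)
{
	assert(pos >= 0 && reclen > 0);
	return pos % (SEEKDIST + reclen) < SEEKDIST;
}
```

A "found non-null" offset for which expect_nul() returns nonzero is a byte that the writer never touched, so anything other than NUL there came from somewhere else.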