Russell,
Thanks for the hints but I believe I found a bug in XFS 1.2, and perhaps
you could verify it since I'm not very familiar with the XFS code.
both linvfs_write_full_page and write_full_page will attempt to
unlock_page in some circumstances. The page is previously locked in the
function generic_file_write_nolock. This generic_file_write_nolock
function will unlock the page after it is done using it. Unlocking a
page twice will result in an Oops, so either the mm layer
(generic_file_write_nolock) shouldn't be unlocking the page, or the xfs
code in linvfs_write_full_page and write_full_page shouldn't unlock the
page a second time. The unlock in the xfs code is suspect, since it
only unlocks on certain conditions (which will result in an Oops)
The function linvfs_write_full_page unlocks a page in the case of an
error conditon. I have not been able to hit this error condition, but
if it occurs, an Oops will most certainly occur.
STATIC int
linvfs_write_full_page(
struct page *page)
{
int flagset = 0;
int error;
int need_trans;
int nr_delalloc, nr_unmapped;
if (count_page_state(page, &nr_delalloc, &nr_unmapped)) {
need_trans = nr_delalloc + nr_unmapped;
} else {
need_trans = 1;
}
if ((current->flags & (PF_FSTRANS|PF_NOIO)) && need_trans)
goto out_fail;
if (need_trans) {
current->flags |= PF_NOIO;
flagset = 1;
}
error = write_full_page(page, nr_delalloc);
if (flagset)
current->flags &= ~PF_NOIO;
return error;
out_fail:
SetPageDirty(page);
unlock_page(page);
return 0;
}
The function write_full_page will unlock the page if delalloc_convert
returns a negative value. I don't understand why delalloc_convert
returns a negative value, and if the page should be unlocked, but it
seems that it shouldn't. Could you give me some background on why the
page is unlocked at this point? If I comment out this unlock, bonnie++
does not crash and the filesystem seems still stable, but I'd like to
understand _why_ this condition is occuring and if the unlock_page is
really needed, or if letting the mm handle it is appropriate.
STATIC int
write_full_page(
struct page *page,
int delalloc)
{
struct inode *inode = (struct inode*)page->mapping->host;
unsigned long end_index = inode->i_size >> PAGE_CACHE_SHIFT;
int ret;
/* Are we off the end of the file ? */
if (page->index >= end_index) {
unsigned offset = inode->i_size & (PAGE_CACHE_SIZE-1);
if ((page->index >= end_index+1) || !offset) {
ret = -EIO;
goto out;
}
}
if (!page_has_buffers(page)) {
create_empty_buffers(page, inode->i_dev, 1 << inode->i_blkbits);
}
ret = delalloc_convert(inode, page, 1, delalloc == 0);
out:
if (ret < 0) {
/*
* If it's delalloc and we have nowhere to put it,
* throw it away.
*/
if (delalloc)
block_flushpage(page, 0);
ClearPageUptodate(page);
// unlock_page(page);
}
return ret;
}
It looks like the real problem is that no error return is processed to
not unlock the buffer if an error is returned by linvfs_write_full_page
by the function generic_file_write_nolock in the call chain but the
return values are used to indicate state in some of the calls and cannot
be used as error returns...
The call stack that gets to the point where the page is unlocked is:
generic_file_write_nolock()
generic_commit_write()
__block_commit_write()
balance_dirty()
write_some_buffers()
write_buffer_delay()
linvfs_write_full_page()
write_full_page()
After that the finish of the call generic_file_write_nolock will also
unlock that same page causing an oops.
Thanks!
-steve
Rusell Cattelan wrote:
On Tue, 2003-03-25 at 12:23, Steven Dake wrote:
XFS Developers,
Ok here is what I have done:
I have a tree that already had XFS 1.1 (with a bunch of other stuff)
included. I hand applied the XFS 1.2 on top of XFS 1.1 patch and
everything seems to work fine, except when I run bonnie++ and under load
(Writing intelligently operation), I receive the following Oops:
If possible get kbd into your tree.
Then open a bug and attach what ever backtraces you can get.
bta is a good place to start.
That might shed some light on the situation, but without wading through
the lock changes you might have in your tree this problem may be hard
to track down.
-Russell
|