Russell,
Thanks for all the help, I believe I finally found the issue.
I ensured that any page locked in generic_file_write_nolock was never
used in write_full_page. I used xfs 1.2 / kernel 2.4.19 /
e1000/qla2300. In 2.4.18 / xfs1.2 , the page locked was reused,
resulting in the potential that one page could potentially be unlocked
twice if the page was past the end of the inode.
After doing all that, I went through and rediffed the sources and found
the change between 2.4.18 and 2.4.19 which apprently allows xfs to work
on 2.4.18 as with 2.4.19. With the change, my 2.4.18 kernel doesn't
attempt to unlock the page twice. That patch is attached.
Thanks
-steve
Rusell Cattelan wrote:
On Wed, 2003-03-26 at 11:22, Steven Dake wrote:
Russell thanks for the responses and questions below.
Russell Cattelan wrote:
Ok I've looked over the code.
write_full_page is modeled after block_write_full_page.
Is block_write_full_page something new in XFS CVS? I don't see it in 1.2...
block_write_full_page is the generic write_full_page use by most file
systems.
eg
static int ext2_writepage(struct page *page)
{
return block_write_full_page(page,ext2_get_block);
}
The check for writing beyond the end of file is apparently taken from
there.
/* easy case */
if (page->index < end_index)
return __block_write_full_page(inode, page, get_block);
/* things got complicated... */
offset = inode->i_size & (PAGE_CACHE_SIZE-1);
/* OK, are we completely out? */
if (page->index >= end_index+1 || !offset) {
UnlockPage(page);
return -EIO;
}
The past the end of file is the same check used in that function.
Basically the offset var used to see if size of the file is part way
into a page so if the size of the file lands on a page boundry offset
will be 0.
I've been looking at the code and I think it's correct to error
in the case where page->index == i_size (unit converted ofcourse)
That would be the page just past the end if file.
I understand that the code is doing that, but I was wondering why XFS
should care that write_full_page should fail if the file size is on a
page boundary. If the file size is on a page boundary, shouldn't XFS be
able to write the page just fine?
As far as the unlock_page in the error case ... again based on
block_write_full_page
it seems correct to unlock the page in case of an error.
If you unlock it here, you must not unlock it in
generic_file_write_nolock. It can only be unlocked once... So the code
should pick where it shall unlock. Since it will be unlocked later in
generic_file_write_nolock, I don't understand the need to unlock it in
write_full_page at all. The only case I can think of unlocking the
page, is when it is _not_ called from generic_file_write_nolock.
Perhaps this is happening somewhere and that is the need for the
unlock, but if that is the case, generic_file_write_nolock must not
unlock the page a second time... Tthe error code for write_full_page is
never passed up to generic_file_write_nolock, so
generic_file_write_nolock is never aware that it should _not_ unlock the
page because the I/O had a previous error
Yes I agree the unlock path does looks suspicious.
I found 5 spots in the kernel the calls writepage only one
of them filemap_fdatasync actually checks the error code and
it even that doesn't change if the page is unlocked or not.
So yes I think the unlock_page is in error in write_full_page.... and
in block_write_full_page for that matter.
Stick a BUG() in the check for past eof ... the root of the problem seem
to be somebody trying to write past eof.
--- buffer.c.broken Wed Mar 26 11:52:22 2003
+++ buffer.c Wed Mar 26 15:09:00 2003
@@ -127,11 +127,11 @@
static inline int
write_buffer_delay(struct buffer_head *bh)
{
struct page *page = bh->b_page;
- if (TryLockPage(page)) {
+ if (!TryLockPage(page)) {
spin_unlock(&lru_list_lock);
unlock_buffer(bh);
page->mapping->a_ops->writepage(page);
return 1;
}
|