On Thu, Sep 11, 2008 at 06:37:33PM +1000, Lachlan McIlroy wrote:
> If we get an error in xfs_page_state_convert() - and it's not EAGAIN - then
> we throw away the dirty page without converting the delayed allocation. This
> leaves delayed allocations that can never be removed and confuses code that
> expects a flush of the file to clear them. We need to re-dirty the page on
> error so we can try again later or report that the flush failed.
>
> --- a/fs/xfs/linux-2.6/xfs_aops.c 2008-09-11 16:32:11.000000000 +1000
> +++ b/fs/xfs/linux-2.6/xfs_aops.c 2008-09-11 15:44:09.000000000 +1000
> @@ -1147,16 +1147,6 @@ error:
> if (iohead)
> xfs_cancel_ioend(iohead);
>
> - /*
> - * If it's delalloc and we have nowhere to put it,
> - * throw it away, unless the lower layers told
> - * us to try again.
> - */
> - if (err != -EAGAIN) {
> - if (!unmapped)
> - block_invalidatepage(page, 0);
> - ClearPageUptodate(page);
> - }
> return err;
> }
So we keep dirty pages around that we can't write back?
If we are in a low memory situation and the block device
has gone bad, that will prevent memory reclaim from making
progress.
i.e. if we have a bad disk, a user can now take down the system
by running it out of clean memory....
The EAGAIN case is for when we can't get the locks we
need during non-blocking writeback, which is a common case if
there is concurrent writes to this inode....
> @@ -1216,8 +1206,11 @@ xfs_vm_writepage(
> * then mark the page dirty again and leave the page
> * as is.
> */
> - if (current_test_flags(PF_FSTRANS) && need_trans)
> - goto out_fail;
> + if (current_test_flags(PF_FSTRANS) && need_trans) {
> + redirty_page_for_writepage(wbc, page);
> + unlock_page(page);
> + return -EAGAIN;
> + }
Should not return an error here - the redirty_page_for_writepage()
call effective says "can't do this right away, but no error
needs to be reported because it can be written again later".
Happens all the time with non-blocking writes.
> @@ -1231,20 +1224,14 @@ xfs_vm_writepage(
> * to real space and flush out to disk.
> */
> error = xfs_page_state_convert(inode, page, wbc, 1, unmapped);
> - if (error == -EAGAIN)
> - goto out_fail;
> - if (unlikely(error < 0))
> - goto out_unlock;
>
> - return 0;
> + if (error < 0) {
> + redirty_page_for_writepage(wbc, page);
> + unlock_page(page);
> + return error;
> + }
That needs the EAGAIN exception - that's not an error.
For EIO, though, we should really be invalidating the
page like we currently do. However, it should be silent as
per the current behaviour - a rate-limited log warning is
really needed here...
Cheers,
Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx
|