Re: [PATCH 0/2 v2] Fix O_SYNC AIO DIO

To: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
Subject: Re: [PATCH 0/2 v2] Fix O_SYNC AIO DIO
From: Jan Kara <jack@xxxxxxx>
Date: Wed, 4 Sep 2013 12:54:50 +0200
Cc: Jan Kara <jack@xxxxxxx>, linux-fsdevel@xxxxxxxxxxxxxxx, linux-ext4@xxxxxxxxxxxxxxx, xfs@xxxxxxxxxxx, Christoph Hellwig <hch@xxxxxxxxxxxxx>
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <20130830155301.GB13318@xxxxxxxxxxxxxxxxxx>
References: <1376471456-11966-1-git-send-email-jack@xxxxxxx> <20130830155301.GB13318@xxxxxxxxxxxxxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)
On Fri 30-08-13 16:53:01, Al Viro wrote:
> On Wed, Aug 14, 2013 at 11:10:54AM +0200, Jan Kara wrote:
> >   Hello,
> > 
> >   this is the second iteration of patches to fix handling of O_SYNC AIO DIO.
> > Since the previous version I've addressed Dave's comments:
> >  - slightly expanded changelog of the first patch
> >  - workqueue is now created with parameters allowing parallelism
> >  - workqueue name contains sb->s_id
> >  - workqueue is created on demand (I decided to do this to reduce the
> >    overhead in unnecessary cases)
> > 
> > The patchset survives xfstests run for ext4 & xfs so it should be sane. Since
> > this touches several filesystems (although only ext4 & xfs are non-trivial),
> > the question is who should carry these patches. Maybe Al? But since xfs and
> > ext4 changes are non-trivial, I'd like to have a review from their
> > developers...
> Looks sane, except that I'd probably put destroying the queue after
> evict_inodes(), next to ->put_super() call.
  OK, I've changed that. I'll send v3 in a moment.

> Said that, there's another interesting problem in the code affected by that
> sucker: generic_file_aio_write() might very well sync the wrong range.
> Consider O_APPEND case; __generic_file_aio_write() will call
> generic_write_checks(), which will update its copy of pos, and proceed to
> write starting from there.  All right and proper, but then we return into
> generic_file_aio_write() and sync the range of the right length, starting
> at the *original* value of pos...
  Yes, that looks like a bug. I was looking into how we could fix that and
the easiest seems to be to move generic_segment_checks() and
generic_write_checks() from __generic_file_aio_write() to
generic_file_aio_write(). There are only three callers of
__generic_file_aio_write(): cifs_writev(), which can and should use
generic_file_aio_write() anyway; ext4_file_dio_write(), which could use
generic_file_aio_write() if we cleaned up the code and moved it around a
bit; and blkdev_aio_write(), which really needs to call
__generic_file_aio_write() (it doesn't want to grab i_mutex). So that last
caller would need to do the moved checks manually.
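
To make the problem and the proposed shape of the fix concrete, here is a
small user-space model. It is only a sketch, not kernel code: struct toy_file
and every toy_* function are hypothetical stand-ins invented for illustration.
toy_write_checks() plays the role of generic_write_checks() in the O_APPEND
case, and toy_sync_range() merely records the range that would be passed to
the sync step (assumed here to be pos .. pos + len - 1).

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a file; only size and the last synced range matter. */
struct toy_file {
	long long size;        /* current end of file */
	int append;            /* models O_APPEND */
	long long sync_start;  /* first byte of the last synced range */
	long long sync_end;    /* last byte of the last synced range */
};

/* Models the O_APPEND part of the write checks: move pos to EOF. */
static void toy_write_checks(const struct toy_file *f, long long *pos)
{
	if (f->append)
		*pos = f->size;
}

/* Records the range that would be handed to the sync step. */
static void toy_sync_range(struct toy_file *f, long long pos, long long len)
{
	f->sync_start = pos;
	f->sync_end = pos + len - 1;
}

/* Models the data write itself; pos is assumed to be already checked. */
static long long toy_inner_write(struct toy_file *f, long long pos,
				 long long len)
{
	if (pos + len > f->size)
		f->size = pos + len;
	return len;
}

/* Current shape: the checks run on a copy of pos the caller never sees. */
static long long toy_inner_write_with_checks(struct toy_file *f,
					     long long pos, long long len)
{
	toy_write_checks(f, &pos);	/* only this local copy changes */
	return toy_inner_write(f, pos, len);
}

long long toy_write_buggy(struct toy_file *f, long long pos, long long len)
{
	long long ret;

	assert(f != NULL);
	ret = toy_inner_write_with_checks(f, pos, len);
	if (ret > 0)
		toy_sync_range(f, pos, ret);	/* stale, original pos */
	return ret;
}

/* Proposed shape: the checks move up, so the sync uses the adjusted pos. */
long long toy_write_fixed(struct toy_file *f, long long pos, long long len)
{
	long long ret;

	assert(f != NULL);
	toy_write_checks(f, &pos);	/* outer function sees the update */
	ret = toy_inner_write(f, pos, len);
	if (ret > 0)
		toy_sync_range(f, pos, ret);
	return ret;
}
```

With a 4096-byte toy file in append mode and a 100-byte write submitted at
pos 0, toy_write_buggy() records bytes 0..99 as synced although the data
landed at 4096..4195, while toy_write_fixed() records 4096..4195. In this
model, a caller that uses toy_inner_write() directly (the analogue of
blkdev_aio_write()) has to call toy_write_checks() itself first.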

But this all seems a bit complex so I'd prefer to do it as a separate

Jan Kara <jack@xxxxxxx>
