On 1 Feb 2014, at 22:43, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> * ntfs_file_buffered_write() should switch to iov_iter as well. It's
> open-coding a lot of iov_iter stuff.
The NTFS code predates iov_iter by more than ten years, so it simply wasn't
there to use when I wrote NTFS... But yes, I agree that NTFS should be updated
to use it.
> It's not entirely trivial and
> I'd really like to hear from ntfs folks on that, though, and the current
> code looks deadlock-prone. We prefault everything, then lock the pages
> to which we'll be copying, then attempt to do __copy_from_user_inatomic(),
> falling back to __copy_from_user() if that fails. Fine, but what happens
> if the source of write() is mmaped from the same file and we lose CPU while
> prefaulting the last page, have memory pressure evict the first one, then
> have __ntfs_grab_cache_pages() allocate and lock (nonuptodate) pages
> we'll be copying to and have __copy_from_user() try to copy *from* those
> same pages? We are doing it while holding these pages locked, so pagefault
> will have really fun time with them... Anton?
That would deadlock. You are (of course) quite right. But the generic file
write code used to suffer from the same deadlock, and IIRC no one ever
complained much about it...
In the current kernel the problem no longer exists: the atomic copy is done
with page faults disabled, and if it fails the write is retried with just the
first segment, over and over until it succeeds. So updating NTFS to use
iov_iter would indeed be a good opportunity to fix this in NTFS as well.
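To illustrate the retry scheme (a minimal user-space sketch, not the actual
kernel code): the copy is attempted while the destination page is locked and
faults are "disabled", and when it comes back short, the lock is dropped, the
source is faulted in, and the chunk is retried. Here copy_atomic() and
fault_at are hypothetical stand-ins for __copy_from_user_inatomic() and
fault_in_pages_readable():

```c
#include <stddef.h>
#include <string.h>

#define CHUNK 4 /* stand-in for PAGE_CACHE_SIZE */

/* Offset from which the simulated source is "not resident"; the real
 * kernel would take a page fault there.  copy_atomic() stands in for
 * __copy_from_user_inatomic() and returns the number of bytes copied. */
static size_t fault_at;

static size_t copy_atomic(char *dst, const char *src, size_t off, size_t len)
{
	size_t ok;

	if (off >= fault_at)
		ok = 0;
	else if (off + len > fault_at)
		ok = fault_at - off;
	else
		ok = len;
	memcpy(dst + off, src + off, ok);
	return ok;
}

/* Write loop modeled on the retry scheme described above: the copy is
 * attempted "atomically" while the destination page would be locked; a
 * short copy means we unlock, prefault the source, and retry, so the
 * fault is never taken while holding the page lock. */
static void buffered_write(char *dst, const char *src, size_t count)
{
	size_t pos = 0;

	while (pos < count) {
		size_t bytes = count - pos < CHUNK ? count - pos : CHUNK;
		size_t copied;

		/* ... lock destination page(s) here ... */
		copied = copy_atomic(dst, src, pos, bytes);
		/* ... unlock destination page(s) here ... */

		if (copied < bytes)
			/* Simulate fault_in_pages_readable(): make the
			 * rest of the source "resident", then retry. */
			fault_at = count;
		pos += copied;
	}
}
```

The crucial ordering is that the prefault happens only after the page lock
would have been released, which is what removes the deadlock.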
It does not matter if this turns out to be quite inefficient for NTFS (due to
locking/unlocking several pages at once), as it is such a crazy corner case
that it will hardly ever be triggered in real life. The only time NTFS
operates on more than one page at once in ntfs_file_buffered_write() is when a
write happens into a sparse logical block (an NTFS "cluster") on a volume with
a logical block size > PAGE_CACHE_SIZE. Given that the minimum PAGE_CACHE_SIZE
on Linux is 4096 bytes, and that Windows only ever creates volumes with a
logical block size of 4096 (unless the user specifically forces a different
block size when creating the volume), this basically never happens. Also, the
NTFS kernel driver never creates sparse files, i.e. the sparse file would have
had to come from Windows or another NTFS implementation...
I will have a look at moving NTFS to iov_iter and fixing the potential deadlock
at the same time.
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge
J.J. Thomson Avenue, Cambridge, CB3 0RB, UK