
Re: page fault scalability (ext3, ext4, xfs)

To: David Lang <david@xxxxxxx>
Subject: Re: page fault scalability (ext3, ext4, xfs)
From: Andy Lutomirski <luto@xxxxxxxxxxxxxx>
Date: Wed, 14 Aug 2013 23:28:40 -0700
Cc: Dave Chinner <david@xxxxxxxxxxxxx>, "Theodore Ts'o" <tytso@xxxxxxx>, Dave Hansen <dave.hansen@xxxxxxxxx>, Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>, Linux FS Devel <linux-fsdevel@xxxxxxxxxxxxxxx>, xfs@xxxxxxxxxxx, "linux-ext4@xxxxxxxxxxxxxxx" <linux-ext4@xxxxxxxxxxxxxxx>, Jan Kara <jack@xxxxxxx>, LKML <linux-kernel@xxxxxxxxxxxxxxx>, Tim Chen <tim.c.chen@xxxxxxxxxxxxxxx>, Andi Kleen <ak@xxxxxxxxxxxxxxx>
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <alpine.DEB.2.02.1308142316220.31947@xxxxxxxxxxxxxx>
References: <520BB9EF.5020308@xxxxxxxxxxxxxxx> <20130814194359.GA22316@xxxxxxxxx> <520BED7A.4000903@xxxxxxxxx> <20130814230648.GD22316@xxxxxxxxx> <CALCETrVaRQ3WQ5++Uu_0JTaVnjUugAaAhqQK__7r5YWvLxpAhw@xxxxxxxxxxxxxx> <20130815011101.GA3572@xxxxxxxxx> <20130815021028.GM6023@dastard> <CALCETrUfuzgG9U=+eSzCGvbCx-ZskWw+MhQ-qmEyWZK=XWNVmg@xxxxxxxxxxxxxx> <20130815060149.GP6023@dastard> <CALCETrUF+dGhE3qv4LoYmc7A=a+ry93u-d-GgHSAwHXvYN+VNw@xxxxxxxxxxxxxx> <alpine.DEB.2.02.1308142316220.31947@xxxxxxxxxxxxxx>
On Wed, Aug 14, 2013 at 11:18 PM, David Lang <david@xxxxxxx> wrote:
> On Wed, 14 Aug 2013, Andy Lutomirski wrote:
>>> The big problem with this approach is that not doing the
>>> timestamp update on page faults is going to break the inode change
>>> version counting because for ext4, btrfs and XFS it takes a
>>> transaction to bump that counter. NFS needs to know the moment a
>>> file is changed in memory, not when it is written to disk. Also, NFS
>>> requires the change to the counter to be persistent over server
>>> failures, so it needs to be changed as part of a transaction....
>> NFS can do whatever it wants, although I suspect that even NFS can get
>> away with deferring cmtime updates.
> NFS already has to do syncs to make sure the data is safe on disk, have a
> flag that NFS can use to make the ctime safe, everyone else can get the
> performance improvement and NFS can have its slow-but-safe approach.

I don't see the current code that updates times for NFS.  I'm not
planning on making any changes that'll affect NFS at all (i.e. I don't
think any flag will be needed), but I'd be more confident if I
understood why it worked in the first place.

(For filesystems that provide page_mkwrite, there hasn't been a
file_update_time call in the core code for several kernel versions.)

