Re: page fault scalability (ext3, ext4, xfs)

To: Andy Lutomirski <luto@xxxxxxxxxxxxxx>
Subject: Re: page fault scalability (ext3, ext4, xfs)
From: David Lang <david@xxxxxxx>
Date: Wed, 14 Aug 2013 23:18:01 -0700 (PDT)
Cc: Dave Chinner <david@xxxxxxxxxxxxx>, "Theodore Ts'o" <tytso@xxxxxxx>, Dave Hansen <dave.hansen@xxxxxxxxx>, Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>, Linux FS Devel <linux-fsdevel@xxxxxxxxxxxxxxx>, xfs@xxxxxxxxxxx, "linux-ext4@xxxxxxxxxxxxxxx" <linux-ext4@xxxxxxxxxxxxxxx>, Jan Kara <jack@xxxxxxx>, LKML <linux-kernel@xxxxxxxxxxxxxxx>, Tim Chen <tim.c.chen@xxxxxxxxxxxxxxx>, Andi Kleen <ak@xxxxxxxxxxxxxxx>
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <CALCETrUF+dGhE3qv4LoYmc7A=a+ry93u-d-GgHSAwHXvYN+VNw@xxxxxxxxxxxxxx>
References: <520BB9EF.5020308@xxxxxxxxxxxxxxx> <20130814194359.GA22316@xxxxxxxxx> <520BED7A.4000903@xxxxxxxxx> <20130814230648.GD22316@xxxxxxxxx> <CALCETrVaRQ3WQ5++Uu_0JTaVnjUugAaAhqQK__7r5YWvLxpAhw@xxxxxxxxxxxxxx> <20130815011101.GA3572@xxxxxxxxx> <20130815021028.GM6023@dastard> <CALCETrUfuzgG9U=+eSzCGvbCx-ZskWw+MhQ-qmEyWZK=XWNVmg@xxxxxxxxxxxxxx> <20130815060149.GP6023@dastard> <CALCETrUF+dGhE3qv4LoYmc7A=a+ry93u-d-GgHSAwHXvYN+VNw@xxxxxxxxxxxxxx>
User-agent: Alpine 2.02 (DEB 1266 2009-07-14)
On Wed, 14 Aug 2013, Andy Lutomirski wrote:

>> The big problem with this approach is that not doing the
>> timestamp update on page faults is going to break the inode change
>> version counting because for ext4, btrfs and XFS it takes a
>> transaction to bump that counter. NFS needs to know the moment a
>> file is changed in memory, not when it is written to disk. Also, NFS
>> requires the change to the counter to be persistent over server
>> failures, so it needs to be changed as part of a transaction....
>
> NFS can do whatever it wants, although I suspect that even NFS can get
> away with deferring cmtime updates.

NFS already has to do syncs to make sure the data is safe on disk. Add a flag that NFS can use to make the ctime safe; then everyone else can get the performance improvement, and NFS can keep its slow-but-safe approach.

David Lang
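
To make the suggestion above concrete, here is a minimal userspace sketch of the idea, not kernel code. Everything in it is a hypothetical stand-in invented for illustration: the Inode class, the strict_ctime flag, and the write_fault() and sync() helpers do not correspond to any real kernel interface. It only models the trade-off: with the flag clear, a write fault updates the in-memory timestamp and change counter and defers the transaction until sync; with the flag set (the NFS case), every fault pays for a transaction so the counter is persistent immediately.

```python
from dataclasses import dataclass


@dataclass
class Inode:
    # Hypothetical opt-in flag: set by a consumer (e.g. an NFS export)
    # that needs ctime and the change counter persisted on every change.
    strict_ctime: bool = False
    ctime: int = 0            # in-memory change time
    change_version: int = 0   # in-memory change counter
    on_disk_ctime: int = 0    # last value committed by a "transaction"
    on_disk_version: int = 0
    dirty: bool = False       # in-memory state newer than on-disk state
    transactions: int = 0     # how many transactions we paid for


def _commit(inode: Inode) -> None:
    """Model one journal transaction persisting ctime and the counter."""
    inode.on_disk_ctime = inode.ctime
    inode.on_disk_version = inode.change_version
    inode.dirty = False
    inode.transactions += 1


def write_fault(inode: Inode, now: int) -> None:
    """Model a write page fault at time `now`."""
    inode.ctime = now
    inode.change_version += 1
    if inode.strict_ctime:
        _commit(inode)        # slow but safe: persist immediately
    else:
        inode.dirty = True    # fast path: defer the transaction


def sync(inode: Inode) -> None:
    """Model fsync/msync: flush any deferred timestamp update."""
    if inode.dirty:
        _commit(inode)
```

In this model, three write faults on a default inode cost no transactions until sync() is called, at which point a single transaction persists the latest values; the same three faults on an inode created with strict_ctime=True cost three transactions. Whether the in-memory counter alone is enough for NFS between syncs is exactly the point under discussion in this thread, and the sketch does not settle it.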
