
Re: TAKE - reintroduce the nfs inode reference cache

To: Andi Kleen <ak@xxxxxxx>
Subject: Re: TAKE - reintroduce the nfs inode reference cache
From: Neil Brown <neilb@xxxxxxxxxxxxxxx>
Date: Fri, 9 Mar 2001 12:46:50 +1100 (EST)
Cc: Steve Lord <lord@xxxxxxx>, linux-xfs@xxxxxxxxxxx
In-reply-to: message from Andi Kleen on Wednesday March 7
References: <ak@xxxxxxx> <200103062056.f26Kuo106999@xxxxxxxxxxxxxxxxxxxx> <20010306222356.A15312@xxxxxxxxxxxxxxxxxxx> <15013.28348.768980.329307@xxxxxxxxxxxxxxxxxxxxxxxx> <20010307094517.A23323@xxxxxxxxxxxxxxxxxxx>
Sender: owner-linux-xfs@xxxxxxxxxxx
On Wednesday March 7, ak@xxxxxxx wrote:
> 
> Main problem is that struct file is not preserved over RPCs. Various
> file systems put state into struct file (ext2 readahead context, XFS 
> did discard preallocation in f_op->release, which is rather costly).
> So I was looking for a way to preserve struct file across multiple
> RPCs.

Well.... ext2 doesn't really put any state in the "struct file".  The
readahead context is done by the VFS layer and knfsd makes a point of
copying that out into a cache after a read, and putting it back before
the next read on the same file.
But I can appreciate that it makes sense for filesystems to put
context in the "struct file".  Pre-allocation for writes is an obvious
example (though ext2 seems to put that in the inode).
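To illustrate the save/restore idea I mean here, a minimal sketch (the
struct, field and function names are invented for illustration, not the
real knfsd code):

    /* Purely illustrative: a tiny per-inode cache of readahead state,
     * so state can survive between RPCs even though the struct file
     * does not.  Names are made up; the real knfsd code differs.  */
    struct ra_entry {
            dev_t         dev;       /* identifies the file ...       */
            ino_t         ino;       /* ... across separate RPCs      */
            unsigned long ra_start;  /* saved readahead window start  */
            unsigned long ra_len;    /* saved readahead window length */
    };

    #define RA_CACHE_SIZE 16
    static struct ra_entry ra_cache[RA_CACHE_SIZE];

    static struct ra_entry *ra_find(dev_t dev, ino_t ino)
    {
            int i;
            for (i = 0; i < RA_CACHE_SIZE; i++)
                    if (ra_cache[i].dev == dev && ra_cache[i].ino == ino)
                            return &ra_cache[i];
            return NULL;    /* caller recycles some slot */
    }

    /* Before a read: copy ra_start/ra_len into the freshly created
     * struct file.  After the read: copy the updated values back.   */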

Certainly caching "struct file"s would be cleaner than the current
code for caching read-ahead context.
A tricky bit is deciding whether a particular IO request on a
particular file should use a cached "struct file" or whether it should
use a new one.

One could imagine a scenario where two clients are sequentially
reading a large file but are up to different parts of the file.  If we
used the same struct file for both clients (as the current read-ahead
cache effectively does) then it would look like random access and no
readahead would happen.

The obvious response to this would be to have the cache keyed not just
on inode, but also on "client" and maybe "offset".
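Roughly (again just a sketch, with invented names, so different readers
of the one file could find their own cached state):

    struct fcache_key {
            dev_t              dev;
            ino_t              ino;
            struct sockaddr_in client;   /* who sent the RPC           */
            loff_t             offset;   /* where this reader is up to */
    };

    struct fcache_entry {
            struct fcache_key  key;
            struct file       *filp;     /* the cached struct file     */
    };

    /* Lookup would match dev/ino/client exactly, and treat "offset is
     * about where the cached reader got to" as a match, so a
     * sequential reader keeps finding its own entry.                  */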

Just adding the "client" to the key wouldn't help if two programs on
the same client were accessing the file.
Adding the "offset" to cope with this would cause problems if the client
sent, for example, three consecutive reads on the same file.
They would be handled by different threads and would run in parallel
and so would probably confuse the read-ahead code (which assumes
sequential reads).

Another problem with using the current offset is that it assumes that
any context that the filesystem attaches to a file handle is only
really useful for sequential accesses.  This is true for readahead and
presumably for preallocation, but is that the only sort of context?

Of course, the mantra of NFS is "file sharing is rare" so we could just
not worry, and that would be no worse than the current situation, and
possibly better.

Another question is "Will holding a struct file for longer cause any
other problems?".

I know that the userspace NFS server caches file descriptors (which
are like struct files) and this has the effect that if you write to an
executable file via NFS, you cannot execute it on the server until the
cache gets flushed, as Linux stops you from executing a file which is
open for write access.
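That behaviour is easy to demonstrate from userspace; it is ordinary
Linux semantics, nothing NFS-specific (the path here is just a
placeholder for some executable):

    /* While a file is held open for writing, exec'ing it fails
     * with ETXTBSY ("Text file busy").                          */
    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
            int fd = open("/tmp/prog", O_WRONLY);  /* some executable */
            if (fd < 0)
                    return 1;
            execl("/tmp/prog", "prog", (char *)0);
            if (errno == ETXTBSY)
                    printf("cannot exec: text file busy\n");
            close(fd);
            return 0;
    }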

This would not be a problem for the kernel server because
"get_write_access" is separate from holding a struct file, but I would
need to study the code a bit to be sure there are not other similar
problems.


> 
> [This is somewhat similar to another long standing NFS bug BTW -- 
> it shares UDP sockets and there is excessive ARP traffic because the neighbour
> state machine times out all the time and does ARP reprobes]
> 

I don't understand this, I'm afraid, but I am curious.  Could you give
a bit more detail?


NeilBrown
