xfs
[Top] [All Lists]

Re: [PATCH 0/5] xfs: more patches for 3.13

To: Eric Sandeen <sandeen@xxxxxxxxxxx>
Subject: Re: [PATCH 0/5] xfs: more patches for 3.13
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Thu, 14 Nov 2013 12:16:11 +1100
Cc: Ben Myers <bpm@xxxxxxx>, xfs@xxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <5282D2D3.3040601@xxxxxxxxxxx>
References: <1383280040-21979-1-git-send-email-david@xxxxxxxxxxxxx> <20131106230133.GX1935@xxxxxxx> <20131107015706.GM6188@dastard> <5282D2D3.3040601@xxxxxxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)
On Tue, Nov 12, 2013 at 07:16:03PM -0600, Eric Sandeen wrote:
> On 11/6/13, 7:57 PM, Dave Chinner wrote:
> > On Wed, Nov 06, 2013 at 05:01:33PM -0600, Ben Myers wrote:
> >> On Fri, Nov 01, 2013 at 03:27:15PM +1100, Dave Chinner wrote:
> >>> Hi folks,
> >>>
> >>> The following series follows up the recently committed series of
> >>> patches for 3.13. The first two patches are the remaining
> >>> uncommitted patches from the previous series.
> >>>
> >>> The next two patches are tracing patches, one for AIL manipulations
> >>> and the other for AGF and AGI read operations. Both of these were
> >>> written during recent debugging sessions, and both proved useful so
> >>> should be added to the menagerie of tracepoints we already have
> >>> avaialble.
> >>>
> >>> The final patch is the increasing of the inode cluster size for v5
> >>> filesystems. I'd like to get this into v5 filesystems for 3.13 so we
> >>> get wider exposure of it ASAP so we have more data available to be
> >>> able to make informed decisions about how to bring this back to v4
> >>> filesystems in a safe and controlled manner.
> >>
> >> Applied 3 and 4.  I still don't understand why the locking on patch 2 is
> >> correct.  Seems like the readers of i_version hold different locks than we 
> >> do
> >> when we log the inode.  Maybe Christoph can help me with that.
> > 
> > Readers don't need to hold a spinlock, and many don't. The spinlock
> > is only there to prevent concurrent updates from "losing" an update
> > due to races.  All modifications to XFS inodes occur via
> > transactions, inodes are locked exclusively in transactions and
> > hence we will never lose i_version updates due to races. Hence we
> > don't need the spinlock during the update, either.
> 
> I'm not completely convinced that readers don't need to.  What happens when
> we read in the middle of an update?  Especially when a 32-bit box reads the
> 64-bit value in the middle of an update?
> 
> NFS is the only reader we care about (right?)
> 
> I see a several paths to i_version reads in nfs; so far I'm finding locked 
> reads:
> 
> <2 callers of nfs_refresh_inode_locked>
>       spin_lock(&inode->i_lock);
>       nfs_refresh_inode_locked
>               nfs_update_inode
>                       nfs_wcc_update_inode
>                               (... && inode->i_version == 
> fattr->pre_change_attr)
>               ...
>               if (inode->i_version != fattr->change_attr) {
>               ...
>               nfs_check_inode_attributes
>                       (... && inode->i_version != fattr->change_attr)


That's client side, not server side, so that's the NFS client inode
it is locking, not the server side XFS inode.

> ---
> 
> update_changeattr
>       spin_lock(&dir->i_lock);
>       if (!cinfo->atomic || cinfo->before != dir->i_version)

Again, client side.

> ---
> 
> nfs_post_op_update_inode_force_wcc
>        spin_lock(&inode->i_lock);
>        fattr->pre_change_attr = inode->i_version;

Client side.

> 
> ---
> 
> I haven't audited everything but do you have an example of an unlocked
> reader (which is relevant to xfs)?

Server side, where i_version is pulled out of an XFS inode:

$ git grep i_version fs/nfsd
fs/nfsd/nfs3xdr.c:      fhp->fh_post_change = 
fhp->fh_dentry->d_inode->i_version;
fs/nfsd/nfs4xdr.c:              write64(p, inode->i_version);
fs/nfsd/nfsfh.h:                fhp->fh_pre_change = inode->i_version;
$

the nfsfh.h hit is in fill_pre_wcc(), which appears to be called
under i_mutex but not i_lock. The xdr encoding functions don't
appear to be holding i_lock, and may be holding i_mutex, but I
haven't looked that far.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx

<Prev in Thread] Current Thread [Next in Thread>