To: David Chinner <dgc@xxxxxxx>
Subject: Re: TAKE 969192: Default mount option "noikeep" makes the inode generation number non-persistent
From: Mark Goodwin <markgw@xxxxxxx>
Date: Mon, 27 Aug 2007 16:22:14 +1000
Cc: Christoph Hellwig <hch@xxxxxxxxxxxxx>, Vlad Apostolov <vapo@xxxxxxx>, linux-xfs@xxxxxxxxxxx
In-reply-to: <20070824124933.GS61154114@sgi.com>
Organization: SGI Engineering
References: <46CE581A.2000405@sgi.com> <20070824113631.GA26868@infradead.org> <20070824124933.GS61154114@sgi.com>
Reply-to: markgw@xxxxxxx
Sender: xfs-bounce@xxxxxxxxxxx
User-agent: Thunderbird 1.5.0.12 (Windows/20070509)


David Chinner wrote:
On Fri, Aug 24, 2007 at 12:36:31PM +0100, Christoph Hellwig wrote:
On Fri, Aug 24, 2007 at 02:01:30PM +1000, Vlad Apostolov wrote:
To avoid the problem with identical DMAPI handles, the XFSMNT_IDELETE mount
option should be set as default, only if the filesystem is not mounted with
XFSMNT_DMAPI.
Note that we have the same problem with NFS exports as well.  Maybe we
need a real fix instead and keep a block of generation numbers around even
if an inode cluster is freed or something similar.

Yes. NFS is less critical than DMAPI, though - with NFS filehandles just a change in generation number is usually good enough to catch most stale filehandle issues. With DMAPI, there are applications that record inode number/generation pairs and expect them never to repeat.

We haven't had any reports of problems with NFS servers due to this,
but as soon as our HSM was exposed to this code we started getting
strange coherency and corruption problems that have taken some time
to track down to this issue. Hence this change seems like the
best tradeoff while we work out a real solution.

At this point I suspect a deleted inode cluster btree in the AGI
is the best solution because it can share most of the btree
code with the current AGI btree and keeps the granularity of
shared generation numbers quite fine.
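
To make the idea above concrete, here is a minimal sketch of keeping one shared "highest generation" record per freed inode cluster. It is not XFS code: a flat array stands in for the proposed btree, and every name in it (freed_cluster, freed_table, cluster_freed, cluster_start_gen, MAX_FREED) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_FREED 16  /* arbitrary capacity for this sketch */

/* One record per freed inode cluster (hypothetical layout). */
struct freed_cluster {
    uint64_t start_ino;  /* first inode number of the freed cluster */
    uint32_t high_gen;   /* highest generation any inode in it reached */
};

/* Flat array standing in for the proposed on-disk btree. */
struct freed_table {
    struct freed_cluster rec[MAX_FREED];
    size_t count;
};

/*
 * Record the highest generation seen when a cluster is freed.
 * Returns 0 on success, -1 if the table is full.
 */
static int cluster_freed(struct freed_table *t, uint64_t start_ino,
                         uint32_t high_gen)
{
    for (size_t i = 0; i < t->count; i++) {
        if (t->rec[i].start_ino == start_ino) {
            /* Never let the recorded value go backwards. */
            if (high_gen > t->rec[i].high_gen)
                t->rec[i].high_gen = high_gen;
            return 0;
        }
    }
    if (t->count == MAX_FREED)
        return -1;
    t->rec[t->count].start_ino = start_ino;
    t->rec[t->count].high_gen = high_gen;
    t->count++;
    return 0;
}

/*
 * Generation to start from when a cluster is allocated at start_ino:
 * one past the recorded value, or 0 if the cluster was never freed.
 */
static uint32_t cluster_start_gen(const struct freed_table *t,
                                  uint64_t start_ino)
{
    for (size_t i = 0; i < t->count; i++)
        if (t->rec[i].start_ino == start_ino)
            return t->rec[i].high_gen + 1u;
    return 0;
}
```

Sharing one value per cluster rather than one per inode keeps the record small, at the cost of skipping some generation values that were never actually used.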

Having a persistent highest/shared generation number per inode cluster only solves part of the problem - with only 32 bits of precision, eventually it will wrap. Generation numbers need more precision to solve this completely. With more precision, the starting value could simply be based on a timestamp ...
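
As a rough illustration of the wrap problem and the timestamp idea, the sketch below contrasts a 32-bit counter with a wider one. The 64-bit layout (seconds in the high 32 bits, a counter in the low 32 bits) is purely an assumption for the example, and gen32_next, gen64_seed and gen64_next are hypothetical names, not anything in XFS.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bump a 32-bit generation number. Unsigned arithmetic wraps modulo
 * 2^32, so after enough reuse an old value comes back.
 */
static uint32_t gen32_next(uint32_t gen)
{
    return (uint32_t)(gen + 1u);
}

/*
 * Hypothetical wider scheme: seed a 64-bit generation from a timestamp,
 * with the seconds in the high 32 bits and the low 32 bits left as a
 * counter starting at zero.
 */
static uint64_t gen64_seed(uint32_t seconds)
{
    return (uint64_t)seconds << 32;
}

/* Bump a 64-bit generation number. */
static uint64_t gen64_next(uint64_t gen)
{
    return gen + 1u;
}
```

Under these assumptions, a value seeded at a later second is larger than anything reachable from an earlier seed in fewer than 2^32 increments, so nothing needs to be remembered about the freed cluster, provided the clock never goes backwards.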

--

 Mark Goodwin                                  markgw@xxxxxxx
 Engineering Manager for XFS and PCP    Phone: +61-3-99631937
 SGI Australian Software Group           Cell: +61-4-18969583
-------------------------------------------------------------

