Looks good.
On Mon, Apr 28, 2008 at 04:20:32PM +1000, David Chinner wrote:
> Don't initialise new inode generation numbers to zero
>
> When we allocation new inode chunks, we initialise the generation
> numbers to zero. This works fine until we delete a chunk and then
> reallocate it, resulting in the same inode numbers but with a
> reset generation count. This can result in inode/generation
> pairs of different inodes occurring relatively close together.
>
> Given that the inode/gen pair makes up the "unique" portion of
> an NFS filehandle on XFS, this can result in file handles cached
> on clients being seen on the wire from the server but refer to
> a different file. This causes .... issues for NFS clients.
>
> Hence we need a unique generation number initialisation for
> each inode to prevent reuse of a small portion of the generation
> number space. Make this initialiser per-allocation group so
> that it is not a single point of contention in the filesystem,
> and increment it on every allocation within an AG to reduce the
> chance that a generation number is reused for a given inode number
> if the inode chunk is deleted and reallocated immediately
> afterwards.
>
> Version 3:
> o use random32 rather than get_random_int() as cryptographically
> secure random numbers are not really necessary here.
>
> Version 2:
> o remove persistent per-AGI agi_newinogen field and replace with
> randomly generated 32 bit number for each new cluster. This prevents
> NFS clients from potentially guessing what the next generation
> number is going to be and removes the need for persistent numbers on
> disk.
>
> Signed-off-by: Dave Chinner <dgc@xxxxxxx>
> ---
> fs/xfs/xfs_ialloc.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> Index: 2.6.x-xfs-new/fs/xfs/xfs_ialloc.c
> ===================================================================
> --- 2.6.x-xfs-new.orig/fs/xfs/xfs_ialloc.c 2008-04-28 16:12:57.376445802
> +1000
> +++ 2.6.x-xfs-new/fs/xfs/xfs_ialloc.c 2008-04-28 16:15:04.427919630 +1000
> @@ -147,6 +147,7 @@ xfs_ialloc_ag_alloc(
> int version; /* inode version number to use */
> int isaligned = 0; /* inode allocation at stripe unit */
> /* boundary */
> + unsigned int gen;
>
> args.tp = tp;
> args.mp = tp->t_mountp;
> @@ -290,6 +291,14 @@ xfs_ialloc_ag_alloc(
> else
> version = XFS_DINODE_VERSION_1;
>
> + /*
> + * Seed the new inode cluster with a random generation number. This
> + * prevents short-term reuse of generation numbers if a chunk is
> + * freed and then immediately reallocated. We use random numbers
> + * rather than a linear progression to prevent the next generation
> + * number from easily guessable.
> + */
> + gen = random32();
> for (j = 0; j < nbufs; j++) {
> /*
> * Get the block.
> @@ -309,6 +318,7 @@ xfs_ialloc_ag_alloc(
> free = XFS_MAKE_IPTR(args.mp, fbuf, i);
> free->di_core.di_magic = cpu_to_be16(XFS_DINODE_MAGIC);
> free->di_core.di_version = version;
> + free->di_core.di_gen = cpu_to_be32(gen);
> free->di_next_unlinked = cpu_to_be32(NULLAGINO);
> xfs_ialloc_log_di(tp, fbuf, i,
> XFS_DI_CORE_BITS | XFS_DI_NEXT_UNLINKED);
---end quoted text---
|