On Tue, Feb 07, 2012 at 01:10:42PM -0500, Christoph Hellwig wrote:
> Timestamps on regular files are the last metadata that XFS does not update
> transactionally. Now that we use the delaylog mode exclusively and made
> the log scode scale extremly well there is no need to bypass that code for
> timestamp updates. Logging all updates allows to drop a lot of code, and
> will allow for further performance improvements later on.
> Note that this patch drops optimized handling of fdatasync - it will be
> added back in a separate commit.
Only thing I see here is that we'll now block reads on a frozen
filesystem. That will bring us in line with ext3/4 behaviour, but
that's probably no bad thing from a consistency/user expectation POV.
> Signed-off-by: Christoph Hellwig <hch@xxxxxx>
The code looks ok, but could you elaborate on this:
> --- xfs.orig/fs/xfs/xfs_super.c 2012-02-07 10:01:26.804584821 -0800
> +++ xfs/fs/xfs/xfs_super.c 2012-02-07 10:01:29.214584865 -0800
> @@ -868,91 +868,58 @@ xfs_fs_inode_init_once(
> - * Dirty the XFS inode when mark_inode_dirty_sync() is called so that
> - * we catch unlogged VFS level updates to the inode.
> + * This is called by the VFS when dirtying inode metadata. This can happen
> + * for a few reasons, but we only care about timestamp updates, given that
> + * we handled the rest ourselves. In theory no other calls should happen,
> + * but for example generic_write_end() keeps dirtying the inode after
> + * updating i_size. Thus we check that the flags are exactly I_DIRTY_SYNC,
> + * and skip this call otherwise.
> - * We need the barrier() to maintain correct ordering between unlogged
> - * updates and the transaction commit code that clears the i_update_core
> - * field. This requires all updates to be completed before marking the
> - * inode dirty.
> + * We'll hopefull get a different method just for updating timestamps soon,
> + * at which point this hack can go away, and maybe we'll also get real
> + * error handling here.
Are you looking at adding specific superblock operation for updating
timestamps on an inode? Or something else?
As to fdatasync optimisations, we could just add a new (flags) field
to the inode log item (i.e. separate to the format flags) that is
set here if and only if the inode has not already had it's core
logged and is in the CIL. If subsequent transactions that log the
core clear that flag, then we have a flag that would only be set if
timestamps have been logged by themselves. That would give
xfs_file_fsync some method of determining whether fdatasync
should force the log or not....
> @@ -1471,7 +1438,6 @@ static const struct super_operations xfs
> .alloc_inode = xfs_fs_alloc_inode,
> .destroy_inode = xfs_fs_destroy_inode,
> .dirty_inode = xfs_fs_dirty_inode,
> - .write_inode = xfs_fs_write_inode,
> .evict_inode = xfs_fs_evict_inode,
> .put_super = xfs_fs_put_super,
> .sync_fs = xfs_fs_sync_fs,
Oh, how I like that. ;)