[PATCH] xfs: quiesce the filesystem after recovery on readonly mount
Eric Sandeen
sandeen at sandeen.net
Thu Sep 22 20:39:58 CDT 2016
On 9/22/16 7:11 PM, Dave Chinner wrote:
> From: Dave Chinner <dchinner at redhat.com>
>
> Recently we've had a number of reports where log recovery on a v5
> filesystem has reported corruptions that looked to be caused by
> recovery being re-run over the top of an already-recovered
> metadata. This has uncovered a bug in recovery (fixed elsewhere)
> but the vector that caused this was largely unknown.
>
> A kdump test started tripping over this problem - the system
> would be crashed, the kdump kernel and environment would boot and
> dump the kernel core image, and then the system would reboot. After
> reboot, the root filesystem was triggering log recovery and
> corruptions were being detected. The metadumps indicated the above
> log recovery issue.
>
> What is happening is that the kdump kernel and environment is
> mounting the root device read-only to find the binaries needed to do
> it's work. The result of this is that it is running log recovery.
> However, because there were unlinked files and EFIs to be processed
> by recovery, the completion of phase 1 of log recovery could not
> mark the log clean. And because it's a read-only mount, the unmount
> process does not write records to the log to mark it clean, either.
> Hence on the next mount of the filesystem, log recovery was run
> again across all the metadata that had already been recovered and
> this is what triggered corruption warnings.
>
> To avoid this problem, we need to ensure that a read-only mount
> always updates the log when it completes the second phase of
> recovery. We already handle this sort of issue with rw->ro remount
> transitions, so the solution is as simple as quiescing the
> filesystem at the appropriate time during the mount process. This
> results in the log being marked clean so the mount behaviour
> recorded in the logs on repeated RO mounts will change (i.e. log
> recovery will no longer be run on every mount until a RW mount is
> done). This is a user visible change in behaviour, but it is
> harmless.
Excellent idea :)
Reviewed-by: Eric Sandeen <sandeen at redhat.com>
> Signed-off-by: Dave Chinner <dchinner at redhat.com>
> ---
> fs/xfs/xfs_mount.c | 14 ++++++++++++++
> fs/xfs/xfs_super.c | 2 +-
> fs/xfs/xfs_super.h | 1 +
> 3 files changed, 16 insertions(+), 1 deletion(-)
>
> diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
> index faeead6..56e85a6 100644
> --- a/fs/xfs/xfs_mount.c
> +++ b/fs/xfs/xfs_mount.c
> @@ -934,6 +934,20 @@ xfs_mountfs(
> }
>
> /*
> + * Now the log is fully replayed, we can transition to full read-only
> + * mode for read-only mounts. This will sync all the metadata and clean
> + * the log so that the recovery we just performed does not have to be
> + * replayed again on the next mount.
> + *
> + * We use the same quiesce mechanism as the rw->ro remount, as they are
> + * semantically identical operations.
> + */
> + if ((mp->m_flags & (XFS_MOUNT_RDONLY|XFS_MOUNT_NORECOVERY)) ==
> + XFS_MOUNT_RDONLY) {
> + xfs_quiesce_attr(mp);
> + }
> +
> + /*
> * Complete the quota initialisation, post-log-replay component.
> */
> if (quotamount) {
> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> index 3409753..2d092f9 100644
> --- a/fs/xfs/xfs_super.c
> +++ b/fs/xfs/xfs_super.c
> @@ -1137,7 +1137,7 @@ xfs_restore_resvblks(struct xfs_mount *mp)
> * Note: xfs_log_quiesce() stops background log work - the callers must ensure
> * it is started again when appropriate.
> */
> -static void
> +void
> xfs_quiesce_attr(
> struct xfs_mount *mp)
> {
> diff --git a/fs/xfs/xfs_super.h b/fs/xfs/xfs_super.h
> index 529bce9..b6418ab 100644
> --- a/fs/xfs/xfs_super.h
> +++ b/fs/xfs/xfs_super.h
> @@ -61,6 +61,7 @@ struct xfs_mount;
> struct xfs_buftarg;
> struct block_device;
>
> +extern void xfs_quiesce_attr(struct xfs_mount *mp);
> extern void xfs_flush_inodes(struct xfs_mount *mp);
> extern void xfs_blkdev_issue_flush(struct xfs_buftarg *);
> extern xfs_agnumber_t xfs_set_inode_alloc(struct xfs_mount *,
>
More information about the xfs
mailing list