[PATCH v2] xfs: handle dquot buffer readahead in log recovery correctly
Brian Foster
bfoster at redhat.com
Thu Jan 7 06:44:33 CST 2016
On Thu, Jan 07, 2016 at 02:08:30PM +1100, Dave Chinner wrote:
>
> From: Dave Chinner <dchinner at redhat.com>
>
> When we do dquot readahead in log recovery, we do not use a verifier
> as the underlying buffer may not have dquots in it. e.g. the
> allocation operation hasn't yet been replayed. Hence we do not want
> to fail recovery because we detect an operation to be replayed has
> not been run yet. This problem was addressed for inodes in commit
> d891400 ("xfs: inode buffers may not be valid during recovery
> readahead") but the problem was not recognised to exist for dquots
> and their buffers as the dquot readahead did not have a verifier.
>
> The result of not using a verifier is that when the buffer is then
> next read to replay a dquot modification, the dquot buffer verifier
> will only be attached to the buffer if *readahead is not complete*.
> Hence we can read the buffer, replay the dquot changes and then add
> it to the delwri submission list without it having a verifier
> attached to it. This then generates warnings in xfs_buf_ioapply(),
> which catches and warns about this case.
>
> Fix this and make it handle the same readahead verifier error cases
> as for inode buffers by adding a new readahead verifier that has a
> write operation as well as a read operation that marks the buffer as
> not done if any corruption is detected. Also make sure we don't run
> readahead if the dquot buffer has been marked as cancelled by
> recovery.
>
> This will result in readahead either succeeding and the buffer
> having a valid write verifier, or readahead failing and the buffer
> state requiring the subsequent read to resubmit the IO with the new
> verifier. In either case, this will result in the buffer always
> ending up with a valid write verifier on it.
>
> Note: we also need to fix the inode buffer readahead error handling
> to mark the buffer with EIO. Brian noticed the code I copied from
> there wrong during review, so fix it at the same time. Add comments
> linking the two functions that handle readahead verifier errors
> together so we don't forget this behavioural link in future.
>
> cc: <stable at vger.kernel.org> # 3.12 - current
> Signed-off-by: Dave Chinner <dchinner at redhat.com>
> ---
>
> Version 2
> - fix logic error in determining if verify failed
> - set error on buffer when verifier fails
> - fix inode buffer readahead verifier to set error when it fails
> - better comments, link dquot and inode buffer ra verifiers in the
> comments
>
> fs/xfs/libxfs/xfs_dquot_buf.c | 36 ++++++++++++++++++++++++++++++------
> fs/xfs/libxfs/xfs_inode_buf.c | 14 +++++++++-----
> fs/xfs/libxfs/xfs_quota_defs.h | 2 +-
> fs/xfs/libxfs/xfs_shared.h | 1 +
> fs/xfs/xfs_log_recover.c | 9 +++++++--
> 5 files changed, 48 insertions(+), 14 deletions(-)
>
> diff --git a/fs/xfs/libxfs/xfs_dquot_buf.c b/fs/xfs/libxfs/xfs_dquot_buf.c
> index 11cefb2..3cc3cf7 100644
> --- a/fs/xfs/libxfs/xfs_dquot_buf.c
> +++ b/fs/xfs/libxfs/xfs_dquot_buf.c
...
> @@ -264,6 +264,25 @@ xfs_dquot_buf_read_verify(
> }
>
> /*
> + * readahead errors are silent and simply leave the buffer as !done so a real
> + * read will then be run with the xfs_dquot_buf_ops verifier. See
> + * xfs_inode_buf_verify() for why we use EIO and ~XBF_DONE here rather than
> + * reporting the failure.
> + */
> +static void
> +xfs_dquot_buf_readahead_verify(
> + struct xfs_buf *bp)
> +{
> + struct xfs_mount *mp = bp->b_target->bt_mount;
> +
> + if (!xfs_dquot_buf_verify_crc(mp, bp) ||
> + !xfs_dquot_buf_verify(mp, bp, 0)) {
> + xfs_buf_ioerror(bp, -EIO);
> + bp->b_flags &= ~XBF_DONE;
Do we really need to clear the flag if the I/O infrastructure doesn't
set it until after the verifier is invoked? It's harmless, so if the
intent is to just be cautious or future-proof:
Reviewed-by: Brian Foster <bfoster at redhat.com>
... but it does look slightly confusing, without a mention in the
comments at least.
Brian
> + }
> +}
> +
> +/*
> * we don't calculate the CRC here as that is done when the dquot is flushed to
> * the buffer after the update is done. This ensures that the dquot in the
> * buffer always has an up-to-date CRC value.
> @@ -274,7 +293,7 @@ xfs_dquot_buf_write_verify(
> {
> struct xfs_mount *mp = bp->b_target->bt_mount;
>
> - if (!xfs_dquot_buf_verify(mp, bp)) {
> + if (!xfs_dquot_buf_verify(mp, bp, XFS_QMOPT_DOWARN)) {
> xfs_buf_ioerror(bp, -EFSCORRUPTED);
> xfs_verifier_error(bp);
> return;
> @@ -287,3 +306,8 @@ const struct xfs_buf_ops xfs_dquot_buf_ops = {
> .verify_write = xfs_dquot_buf_write_verify,
> };
>
> +const struct xfs_buf_ops xfs_dquot_buf_ra_ops = {
> + .name = "xfs_dquot_ra",
> + .verify_read = xfs_dquot_buf_readahead_verify,
> + .verify_write = xfs_dquot_buf_write_verify,
> +};
> diff --git a/fs/xfs/libxfs/xfs_inode_buf.c b/fs/xfs/libxfs/xfs_inode_buf.c
> index 1b8d98a..4816209 100644
> --- a/fs/xfs/libxfs/xfs_inode_buf.c
> +++ b/fs/xfs/libxfs/xfs_inode_buf.c
> @@ -62,11 +62,14 @@ xfs_inobp_check(
> * has not had the inode cores stamped into it. Hence for readahead, the buffer
> * may be potentially invalid.
> *
> - * If the readahead buffer is invalid, we don't want to mark it with an error,
> - * but we do want to clear the DONE status of the buffer so that a followup read
> - * will re-read it from disk. This will ensure that we don't get an unnecessary
> - * warnings during log recovery and we don't get unnecssary panics on debug
> - * kernels.
> + * If the readahead buffer is invalid, we need to mark it with an error and
> + * clear the DONE status of the buffer so that a followup read will re-read it
> + * from disk. We don't report the error otherwise to avoid warnings during log
> + * recovery and we don't get unnecssary panics on debug kernels. We use EIO here
> + * because all we want to do is say readahead failed; there is no-one to report
> + * the error to, so this will distinguish it from a non-ra verifier failure.
> + * Changes to this readahead error behavour also need to be reflected in
> + * xfs_dquot_buf_readahead_verify().
> */
> static void
> xfs_inode_buf_verify(
> @@ -92,6 +95,7 @@ xfs_inode_buf_verify(
> XFS_ERRTAG_ITOBP_INOTOBP,
> XFS_RANDOM_ITOBP_INOTOBP))) {
> if (readahead) {
> + xfs_buf_ioerror(bp, -EIO);
> bp->b_flags &= ~XBF_DONE;
> return;
> }
> diff --git a/fs/xfs/libxfs/xfs_quota_defs.h b/fs/xfs/libxfs/xfs_quota_defs.h
> index 1b0a083..f51078f 100644
> --- a/fs/xfs/libxfs/xfs_quota_defs.h
> +++ b/fs/xfs/libxfs/xfs_quota_defs.h
> @@ -153,7 +153,7 @@ typedef __uint16_t xfs_qwarncnt_t;
> #define XFS_QMOPT_RESBLK_MASK (XFS_QMOPT_RES_REGBLKS | XFS_QMOPT_RES_RTBLKS)
>
> extern int xfs_dqcheck(struct xfs_mount *mp, xfs_disk_dquot_t *ddq,
> - xfs_dqid_t id, uint type, uint flags, char *str);
> + xfs_dqid_t id, uint type, uint flags, const char *str);
> extern int xfs_calc_dquots_per_chunk(unsigned int nbblks);
>
> #endif /* __XFS_QUOTA_H__ */
> diff --git a/fs/xfs/libxfs/xfs_shared.h b/fs/xfs/libxfs/xfs_shared.h
> index 5be5297..15c3ceb 100644
> --- a/fs/xfs/libxfs/xfs_shared.h
> +++ b/fs/xfs/libxfs/xfs_shared.h
> @@ -49,6 +49,7 @@ extern const struct xfs_buf_ops xfs_inobt_buf_ops;
> extern const struct xfs_buf_ops xfs_inode_buf_ops;
> extern const struct xfs_buf_ops xfs_inode_buf_ra_ops;
> extern const struct xfs_buf_ops xfs_dquot_buf_ops;
> +extern const struct xfs_buf_ops xfs_dquot_buf_ra_ops;
> extern const struct xfs_buf_ops xfs_sb_buf_ops;
> extern const struct xfs_buf_ops xfs_sb_quiet_buf_ops;
> extern const struct xfs_buf_ops xfs_symlink_buf_ops;
> diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
> index 26e67b4..da37beb 100644
> --- a/fs/xfs/xfs_log_recover.c
> +++ b/fs/xfs/xfs_log_recover.c
> @@ -3521,6 +3521,7 @@ xlog_recover_dquot_ra_pass2(
> struct xfs_disk_dquot *recddq;
> struct xfs_dq_logformat *dq_f;
> uint type;
> + int len;
>
>
> if (mp->m_qflags == 0)
> @@ -3541,8 +3542,12 @@ xlog_recover_dquot_ra_pass2(
> ASSERT(dq_f);
> ASSERT(dq_f->qlf_len == 1);
>
> - xfs_buf_readahead(mp->m_ddev_targp, dq_f->qlf_blkno,
> - XFS_FSB_TO_BB(mp, dq_f->qlf_len), NULL);
> + len = XFS_FSB_TO_BB(mp, dq_f->qlf_len);
> + if (xlog_peek_buffer_cancelled(log, dq_f->qlf_blkno, len, 0))
> + return;
> +
> + xfs_buf_readahead(mp->m_ddev_targp, dq_f->qlf_blkno, len,
> + &xfs_dquot_buf_ra_ops);
> }
>
> STATIC void
>
> _______________________________________________
> xfs mailing list
> xfs at oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs
More information about the xfs
mailing list