[PATCH] xfs: fix _xfs_buf_find oops on blocks beyond the filesystem end
Dave Chinner
david at fromorbit.com
Thu Jan 3 16:02:45 CST 2013
On Thu, Jan 03, 2013 at 03:22:22PM -0600, Ben Myers wrote:
> Dave,
>
> On Wed, Dec 19, 2012 at 09:43:45AM +1100, Dave Chinner wrote:
> > From: Dave Chinner <dchinner at redhat.com>
> >
> > When _xfs_buf_find is passed an out of range address, it will fail
> > to find a relevant struct xfs_perag and oops with a null
> > dereference. This can happen when trying to walk a filesystem with a
> > metadata inode that has a partially corrupted extent map (i.e. the
> > block number returned is corrupt, but is otherwise intact) and we
> > try to read from the corrupted block address.
> >
> > In this case, just fail the lookup. If it is readahead being issued,
> > it will simply not be done, but if it is real read that fails we
> > will get an error being reported. Ideally this case should result
> > in an EFSCORRUPTED error being reported, but we cannot return an
> > error through xfs_buf_read() or xfs_buf_get() so this lookup failure
> > may result in ENOMEM or EIO errors being reported instead.
> >
> > Signed-off-by: Dave Chinner <dchinner at redhat.com>
> > ---
> > fs/xfs/xfs_buf.c | 18 ++++++++++++++++++
> > 1 file changed, 18 insertions(+)
> >
> > diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
> > index a80195b..16249d9 100644
> > --- a/fs/xfs/xfs_buf.c
> > +++ b/fs/xfs/xfs_buf.c
> > @@ -487,6 +487,7 @@ _xfs_buf_find(
> > struct rb_node *parent;
> > xfs_buf_t *bp;
> > xfs_daddr_t blkno = map[0].bm_bn;
> > + xfs_daddr_t eofs;
> > int numblks = 0;
> > int i;
> >
> > @@ -498,6 +499,23 @@ _xfs_buf_find(
> > ASSERT(!(numbytes < (1 << btp->bt_sshift)));
> > ASSERT(!(BBTOB(blkno) & (xfs_off_t)btp->bt_smask));
> >
> > + /*
> > + * Corrupted block numbers can get through to here, unfortunately, so we
> > + * have to check that the buffer falls within the filesystem bounds.
> > + */
> > + eofs = XFS_FSB_TO_BB(btp->bt_mount, btp->bt_mount->m_sb.sb_dblocks);
> > + if (blkno >= eofs || blkno + numblks > eofs) {
> ^^^^^^^^^^^^^^^^^^^^^^
>
> That looks suspect to me. I think you need to go over each buffer
> individually.
I'm not trying to validate every single part of a buffer here -
there is no need to do that as the block numbers are validated
against device overruns during IO. i.e. we'll get an EIO and a log
message telling us an attempt to access beyond the end of the device
occurring during IO.
I.e. we aren't doing validity checks on whether a buffer has a sane
block number or not (that's up to the caller), what we are
avoiding is attempting to look up a buffer that is outside of the
range of the cache indexing. i.e. it's validating the cache index we
are about to use, not passing judgement on whether the caller has
asked for a valid set of blocks or not.
> I bounced it off Mark and this was his suggestion:
>
> for (i = 0; i < nmaps; i++) {
> if (map[i].bm_bn >= eofs ||
> map[i].bm_bn + map[i].bm_len >= eofs)
> ...
Sure, that would work, but we really don't care about the secondary
block numbers here - there are completely unused by the buffer cache
except for when IO is issued. And given that _xfs_buf_find is
probably the hottest function in the XFS code base, avoiding
unnecessary checks is somewhat important...
Cheers,
Dave.
--
Dave Chinner
david at fromorbit.com
More information about the xfs
mailing list