[PATCH] xfs_bmap_add_extent_delay_real should set br_startoff
bpm at sgi.com
bpm at sgi.com
Fri Jan 14 13:55:05 CST 2011
Hi Dave,
On Fri, Jan 14, 2011 at 10:15:44AM +1100, Dave Chinner wrote:
> On Thu, Jan 13, 2011 at 01:39:28PM -0600, bpm at sgi.com wrote:
> > Hi Dave,
> >
> > On Thu, Jan 13, 2011 at 11:52:07AM +1100, Dave Chinner wrote:
> > > On Wed, Jan 12, 2011 at 01:42:28PM -0600, Ben Myers wrote:
> > > > When filling in the middle of a previous delayed allocation, set
> > > > br_startoff of the new delay extent to the right to NULLSTARTBLOCK
> > > > so that it is ignored by xfs_bmap_extent_to_btree. This prevents
> > > > a forced shutdown when that in-core extent is converted from delay
> > > > to real and is found to be already in the btree. The value is
> > > > overwritten below.
> >
> > You're right. That's not a very good description. I'll see about
> > describing it here and then see about boiling it down for repost.
> >
> > > I'm not sure I understand what the problem is from your description.
> > > What actually goes wrong in xfs_bmap_extent_to_btree()?
> >
> > While testing a related patch whose side effect is to exercise this code
> > more often...
> >
> > I was hitting XFS_WANT_CORRUPTED_GOTOs in xfs_bmap_extent_delay_real in the
> > LEFT_FILLING, RIGHT_FILLING cases where we are going to insert an extent
> > into the btree and we look it up to make sure it isn't already there:
> >
> > From xfs_bmap_extent_delay_real:
> > 911 case BMAP_LEFT_FILLING:
> > 912 /*
> > 913 * Filling in the first part of a previous delayed allocation.
> > 914 * The left neighbor is not contiguous.
> > 915 */
> > 916 trace_xfs_bmap_pre_update(ip, idx, state, _THIS_IP_);
> > 917 xfs_bmbt_set_startoff(ep, new_endoff);
> > 918 temp = PREV.br_blockcount - new->br_blockcount;
> > 919 xfs_bmbt_set_blockcount(ep, temp);
> > 920 xfs_iext_insert(ip, idx, 1, new, state);
> > 921 ip->i_df.if_lastex = idx;
> > 922 ip->i_d.di_nextents++;
> > 923 if (cur == NULL)
> > 924 rval = XFS_ILOG_CORE | XFS_ILOG_DEXT;
> > 925 else {
> > 926 rval = XFS_ILOG_CORE;
> > 927 if ((error = xfs_bmbt_lookup_eq(cur, new->br_startoff,
> > 928 new->br_startblock, new->br_blockcount,
> > 929 &i)))
> > 930 goto done;
> > 931 XFS_WANT_CORRUPTED_GOTO(i == 0, done);
> >
> > Forced shutdown at 931.
>
> So it was allocating a different part of the delalloc extent
> that triggered it. OK.
>
> <snip>
>
> > This loop in xfs_bmap_extents_to_btree copies extents from the ifork into
> > the btree:
> >
> > 3242 /*
> > 3243 * Fill in the child block.
> > 3244 */
> > 3245 ablock = XFS_BUF_TO_BLOCK(abp);
> > 3246 ablock->bb_magic = cpu_to_be32(XFS_BMAP_MAGIC);
> > 3247 ablock->bb_level = 0;
> > 3248 ablock->bb_u.l.bb_leftsib = cpu_to_be64(NULLDFSBNO);
> > 3249 ablock->bb_u.l.bb_rightsib = cpu_to_be64(NULLDFSBNO);
> > 3250 arp = XFS_BMBT_REC_ADDR(mp, ablock, 1);
> > 3251 nextents = ifp->if_bytes / (uint)sizeof(xfs_bmbt_rec_t);
> > 3252 for (cnt = i = 0; i < nextents; i++) {
> > 3253 ep = xfs_iext_get_ext(ifp, i);
> > 3254 if (!isnullstartblock(xfs_bmbt_get_startblock(ep))) {
> >
> > ^ note that it explicitly ignores delay allocation extents by testing for
> > STARTBLOCKMASK in br_startblock.
> >
> > 3255 arp->l0 = cpu_to_be64(ep->l0);
> > 3256 arp->l1 = cpu_to_be64(ep->l1);
> > 3257 arp++; cnt++;
> > 3258 }
> > 3259 }
> >
> > So... when we converted from extents format to btree format we copied
> > extent to the right into the tree because right.br_startblock = 0
> > which is not a nullstartblock. The right extent was actually
> > destined to become delalloc in the ifork but that assignment
> > happens only after we've converted to btree format.
> >
> > The fix works because we set br_startblock to something that includes
> > STARTBLOCKMASK before inserting it into the ifork. This way the test at
> > 3254 will fail for that extent and it won't be inserted into the btree.
> >
> > > I'm assuming
> > > that it counts one too many real extents, and if so, shouldn't it
> > > fire this assert long before you get to a shutdown situation?
> > >
> > > ASSERT(cnt == XFS_IFORK_NEXTENTS(ip, whichfork));
> >
> > It should.
>
> Thanks for confirming that. Might I suggest that running a debug XFS
> build is a good idea purely because it will catch problems like this
> when they happen, rather than when you trip over the result later
> on?
Agreed. I'd prefer to be running debug xfs builds.
> > > Also, do you have a test case that triggers this? If so, can it be
> > > turned into a xfstests case? I like to have some idea of how the
> > > problem was encountered and verified, because this code is complex
> > > and easy to misunderstand...
> >
> > I don't have a test case. Building one might be possible by using extsize
> > to get large delalloc allocations and modifying xfs_iomap_write_allocate to
> > only allocate for the passed in offset and count instead of from passed in
> > iomap...
>
> Hmmm, not really a generic test case :/
>
> I was thinking that we might be able to use sync_file_range() to
> write small ranges of pages inside a delalloc extent. However, it
> does not limit wbc->nr_to_write() so I suspect it will write entire
> delalloc extents at a time rather than small chunks.
If I'm understanding xfs_iomap_write_allocate correctly, it seems that
the code for writing into the middle of a delalloc extent isn't
exercised very often.
> > > > SGI-PV: 1013221
> > >
> > > The following is mostly the notes I wrote to understand what your
> > > patch does. I'm posting them so others don't need to go through the
> > > same analysis to understand the patch. While you might have the
> > > analysis in the above PV we can't see it at all, so it would be
> > > appreciated if you could put a summary of the bug analysis and
> > > test case in the commit message so we don't have to spend a couple of
> > > hours just to work out what the patch does...
> >
> > Here are some traces from the PV:
> >
> > Dec 21 13:08:22 gtomds1 kernel: [ 2008.237772] xfs_iext_insert (ffff88018de2c3c0) br_startoff 9240 br_startblock 304078611 br_blockcount 512 br_state 0
> > Dec 21 13:08:22 gtomds1 kernel: [ 2008.237775] xfs_bmbt_set_all set w/ startblock == 0. br_startoff 9752, br_startblock 0 br_blockcount 18408, br_state 0
> > ....
> > Dec 21 13:08:22 gtomds1 kernel: [ 2008.238578] xfs_bmap_extents_to_btree (0xffff88018de2c3c0) 0: startoff 0 startblock NULLSTARTBLOCK(5) blockcount 5 flag 0
> > Dec 21 13:08:22 gtomds1 kernel: [ 2008.238583] xfs_bmap_extents_to_btree (0xffff88018de2c3c0) 1: startoff 5 startblock 304069376[4:21fbb00] blockcount 512 flag 0
> > Dec 21 13:08:22 gtomds1 kernel: [ 2008.238590] xfs_bmap_extents_to_btree (0xffff88018de2c3c0) 2: startoff 517 startblock NULLSTARTBLOCK(5) blockcount 2 flag 0
> > Dec 21 13:08:22 gtomds1 kernel: [ 2008.238594] xfs_bmap_extents_to_btree (0xffff88018de2c3c0) 3: startoff 519 startblock 304069890[4:21fbd02] blockcount 512 flag 0
> > Dec 21 13:08:22 gtomds1 kernel: [ 2008.238598] xfs_bmap_extents_to_btree (0xffff88018de2c3c0) 4: startoff 1031 startblock NULLSTARTBLOCK(5) blockcount 1 flag 0
> > Dec 21 13:08:22 gtomds1 kernel: [ 2008.238602] xfs_bmap_extents_to_btree (0xffff88018de2c3c0) 5: startoff 1032 startblock 304070403[4:21fbf03] blockcount 1024 flag 0
> > Dec 21 13:08:22 gtomds1 kernel: [ 2008.238606] xfs_bmap_extents_to_btree (0xffff88018de2c3c0) 6: startoff 2056 startblock NULLSTARTBLOCK(5) blockcount 3 flag 0
> > Dec 21 13:08:22 gtomds1 kernel: [ 2008.238610] xfs_bmap_extents_to_btree (0xffff88018de2c3c0) 7: startoff 2059 startblock 304071430[4:21fc306] blockcount 1024 flag 0
> > Dec 21 13:08:22 gtomds1 kernel: [ 2008.238614] xfs_bmap_extents_to_btree (0xffff88018de2c3c0) 8: startoff 3083 startblock NULLSTARTBLOCK(5) blockcount 3 flag 0
> > Dec 21 13:08:22 gtomds1 kernel: [ 2008.238618] xfs_bmap_extents_to_btree (0xffff88018de2c3c0) 9: startoff 3086 startblock 304072457[4:21fc709] blockcount 2560 flag 0
> > Dec 21 13:08:22 gtomds1 kernel: [ 2008.238623] xfs_bmap_extents_to_btree (0xffff88018de2c3c0) 10: startoff 5646 startblock NULLSTARTBLOCK(5) blockcount 1 flag 0
> > Dec 21 13:08:22 gtomds1 kernel: [ 2008.238627] xfs_bmap_extents_to_btree (0xffff88018de2c3c0) 11: startoff 5647 startblock 304075018[4:21fd10a] blockcount 512 flag 0
> > Dec 21 13:08:22 gtomds1 kernel: [ 2008.238631] xfs_bmap_extents_to_btree (0xffff88018de2c3c0) 12: startoff 6159 startblock NULLSTARTBLOCK(5) blockcount 4 flag 0
> > Dec 21 13:08:22 gtomds1 kernel: [ 2008.238635] xfs_bmap_extents_to_btree (0xffff88018de2c3c0) 13: startoff 6163 startblock 304075534[4:21fd30e] blockcount 1024 flag 0
> > Dec 21 13:08:23 gtomds1 kernel: [ 2008.238639] xfs_bmap_extents_to_btree (0xffff88018de2c3c0) 14: startoff 7187 startblock NULLSTARTBLOCK(5) blockcount 2 flag 0
> > Dec 21 13:08:23 gtomds1 kernel: [ 2008.238643] xfs_bmap_extents_to_btree (0xffff88018de2c3c0) 15: startoff 7189 startblock 304076560[4:21fd710] blockcount 1024 flag 0
> > Dec 21 13:08:23 gtomds1 kernel: [ 2008.238647] xfs_bmap_extents_to_btree (0xffff88018de2c3c0) 16: startoff 8213 startblock NULLSTARTBLOCK(5) blockcount 2 flag 0
> > Dec 21 13:08:23 gtomds1 kernel: [ 2008.238651] xfs_bmap_extents_to_btree (0xffff88018de2c3c0) 17: startoff 8215 startblock 304077586[4:21fdb12] blockcount 1024 flag 0
> > Dec 21 13:08:23 gtomds1 kernel: [ 2008.238655] xfs_bmap_extents_to_btree (0xffff88018de2c3c0) 18: startoff 9239 startblock NULLSTARTBLOCK(79) blockcount 1 flag 0
> > Dec 21 13:08:23 gtomds1 kernel: [ 2008.238659] xfs_bmap_extents_to_btree (0xffff88018de2c3c0) 19: startoff 9240 startblock 304078611[4:21fdf13] blockcount 512 flag 0
> > Dec 21 13:08:23 gtomds1 kernel: [ 2008.238663] xfs_bmap_extents_to_btree (0xffff88018de2c3c0) 20: startoff 9752 startblock 0[0:0] blockcount 18408 flag 0 <---- we added an extent to the btree with startblock=0
> > ....
> > Dec 21 13:08:42 gtomds1 kernel: [ 2009.735411] xfs_bmap_add_extent_delay_real new br_startoff 9752 br_startblock 304079636 br_blockcount1 <---- and it turns out to be the same one we crash on later!
>
> It's this pattern of alloc/delalloc extents that I'm having trouble
> reproducing. The current TOT XFS code simply will not allocate the
> middle part of a delalloc extent like this. What exactly is your test
> doing to produce this sort of pattern? Also, delalloc behaviour has
> changed over different versions, so can you tell me what version of
> XFS you are running to cause this?
IIRC my test case was dd... I think a bug(s) in the patch I
was testing caused the strange allocation patterns. I'll get the test
case for that bug posted shortly.
Thanks,
Ben
More information about the xfs
mailing list