To: xfs@xxxxxxxxxxx
Subject: [XFS updates] XFS development tree branch, master, updated. v2.6.37-rc4-9178-g24446fc
From: xfs@xxxxxxxxxxx
Date: Fri, 28 Jan 2011 16:09:02 -0600
This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "XFS development tree".

The branch, master has been updated
  24446fc xfs: xfs_bmap_add_extent_delay_real should init br_startblock
  0fbca4d xfs: fix dquot shaker deadlock
  c6f990d xfs: handle CIl transaction commit failures correctly
  5315837 xfs: limit extsize to size of AGs and/or MAXEXTLEN
  4ce1598 xfs: prevent extsize alignment from exceeding maximum extent size
  14b064c xfs: limit extent length for allocation to AG size
  b8fc826 xfs: speculative delayed allocation uses rounddown_power_of_2 badly
  e34a314 xfs: fix efi item leak on forced shutdown
  7db37c5 xfs: fix log ticket leak on forced shutdown.
      from  c56eb8fb6dccb83d9fe62fd4dc00c834de9bc470 (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
commit 24446fc66fdebbdd8baca0f44fd2a47ad77ba580
Author: bpm@xxxxxxx <bpm@xxxxxxx>
Date:   Wed Jan 19 17:41:58 2011 +0000

    xfs: xfs_bmap_add_extent_delay_real should init br_startblock
    
    When filling in the middle of a previous delayed allocation in
    xfs_bmap_add_extent_delay_real, set br_startblock of the new delay
    extent to the right to nullstartblock instead of 0 before inserting
    the extent into the ifork (xfs_iext_insert), rather than setting
    br_startblock afterward.
    
    Adding the extent into the ifork with br_startblock=0 can lead to
    the extent being copied into the btree by xfs_bmap_extent_to_btree
    if we happen to convert from extents format to btree format before
    updating br_startblock with the correct value.  The unexpected
    addition of this delay extent to the btree can cause a subsequent
    XFS_WANT_CORRUPTED_GOTO filesystem shutdown in several
    xfs_bmap_add_extent_delay_real cases where we are converting a delay
    extent to real and unexpectedly find an extent already inserted.
    For example:
    
    911         case BMAP_LEFT_FILLING:
    912                 /*
    913                  * Filling in the first part of a previous delayed allocation.
    914                  * The left neighbor is not contiguous.
    915                  */
    916                 trace_xfs_bmap_pre_update(ip, idx, state, _THIS_IP_);
    917                 xfs_bmbt_set_startoff(ep, new_endoff);
    918                 temp = PREV.br_blockcount - new->br_blockcount;
    919                 xfs_bmbt_set_blockcount(ep, temp);
    920                 xfs_iext_insert(ip, idx, 1, new, state);
    921                 ip->i_df.if_lastex = idx;
    922                 ip->i_d.di_nextents++;
    923                 if (cur == NULL)
    924                         rval = XFS_ILOG_CORE | XFS_ILOG_DEXT;
    925                 else {
    926                         rval = XFS_ILOG_CORE;
    927                         if ((error = xfs_bmbt_lookup_eq(cur, new->br_startoff,
    928                                         new->br_startblock, new->br_blockcount,
    929                                         &i)))
    930                                 goto done;
    931                         XFS_WANT_CORRUPTED_GOTO(i == 0, done);
    
    With the bogus extent in the btree we shut down the filesystem at
    line 931.  The conversion from extents to btree format happens when the
    number of extents in the inode increases above ip->i_df.if_ext_max.
    xfs_bmap_extent_to_btree copies extents from the ifork into the
    btree, ignoring all delalloc extents which are denoted by
    br_startblock having some value of nullstartblock.
    
    SGI-PV: 1013221
    
    Signed-off-by: Ben Myers <bpm@xxxxxxx>
    Reviewed-by: Dave Chinner <dchinner@xxxxxxxxxx>
    Signed-off-by: Alex Elder <aelder@xxxxxxx>
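
The role of nullstartblock in the commit above can be illustrated with a
minimal sketch. The record layout, the bit encoding and the helper names
below are simplified assumptions made for illustration; they are not the
real XFS definitions. The sketch only shows why an extent inserted with
br_startblock=0 is treated as a real extent by a copy loop that skips
delalloc extents.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified extent record; not the real XFS layout. */
struct extent {
    uint64_t startoff;    /* file offset, in blocks */
    uint64_t startblock;  /* disk block, or a "null" marker for delalloc */
    uint64_t blockcount;  /* length, in blocks */
};

/* Assumed encoding: all high bits set marks a delayed allocation. */
#define NULL_MARK_BITS (~(uint64_t)0 << 17)

/* Build a "null" start block carrying a small reservation count. */
static uint64_t null_startblock(uint64_t reserved)
{
    return NULL_MARK_BITS | (reserved & ~NULL_MARK_BITS);
}

static int is_null_startblock(uint64_t b)
{
    return (b & NULL_MARK_BITS) == NULL_MARK_BITS;
}

/*
 * Copy only real extents from src into dst, skipping delalloc extents,
 * in the spirit of the extents-to-btree conversion described above.
 * Returns the number of extents copied.
 */
static size_t copy_real_extents(const struct extent *src, size_t n,
                                struct extent *dst)
{
    size_t copied = 0;

    for (size_t i = 0; i < n; i++) {
        if (is_null_startblock(src[i].startblock))
            continue;               /* delalloc: not copied */
        dst[copied++] = src[i];     /* startblock 0 lands here too */
    }
    return copied;
}
```

In this model a delay extent whose start block is still 0 is copied along
with the real extents, while the same extent initialised with
null_startblock() is skipped.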

commit 0fbca4d1c3932c27c4794bf5c2b5fc961cf5a54f
Author: Dave Chinner <dchinner@xxxxxxxxxx>
Date:   Fri Jan 28 11:20:46 2011 +1100

    xfs: fix dquot shaker deadlock
    
    Commit 368e136 ("xfs: remove duplicate code from dquot reclaim") fails
    to unlock the dquot freelist when the number of loop restarts is
    exceeded in xfs_qm_dqreclaim_one(). This causes hangs in memory
    reclaim.
    
    Rework the loop control logic into an unwind stack that all the
    different cases jump into. This means there is only one set of code
    that processes the loop exit criteria, and simplifies the unlocking
    of all the items from different points in the loop. It also fixes a
    double increment of the restart counter from the qi_dqlist_lock
    case.
    
    Reported-by: Malcolm Scott <lkml@xxxxxxxxxxx>
    Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
    Reviewed-by: Alex Elder <aelder@xxxxxxx>
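
The "unwind stack" pattern described above can be sketched as follows.
This is not the code from xfs_qm_dqreclaim_one(); the structure, the
restart limit and the integer standing in for the freelist lock are all
assumptions made for illustration. The point is that every path leaves
through one label that drops the lock and evaluates the exit criteria,
and that the restart counter is incremented in exactly one place.

```c
#include <stdbool.h>

#define MAX_RESTARTS 4          /* assumed limit, for illustration only */

struct freelist {
    int lock_depth;             /* 0 = unlocked, 1 = held; stands in for a mutex */
    int nitems;
    int busy_for[8];            /* item i fails to lock this many more times */
};

/*
 * Try to reclaim one item. Returns its index, or -1 if nothing could be
 * reclaimed. The freelist is always unlocked on return.
 */
static int reclaim_one(struct freelist *fl)
{
    int restarts = 0;
    int found = -1;
    bool retry;
    int i;

again:
    retry = false;
    fl->lock_depth++;                   /* lock the freelist */
    for (i = 0; i < fl->nitems; i++) {
        if (fl->busy_for[i] > 0) {
            fl->busy_for[i]--;
            restarts++;                 /* counted once, here only */
            retry = true;
            goto out_unlock;
        }
        found = i;
        break;
    }

out_unlock:
    fl->lock_depth--;                   /* every exit drops the lock */
    if (retry && restarts < MAX_RESTARTS)
        goto again;
    return found;
}
```

With this shape, giving up after too many restarts goes through the same
unlock as a successful reclaim, so the lock cannot be left held.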

commit c6f990d1ff8e4e53b12f4175eb7d7ea710c3ca73
Author: Dave Chinner <dchinner@xxxxxxxxxx>
Date:   Thu Jan 27 13:23:28 2011 +1100

    xfs: handle CIl transaction commit failures correctly
    
    Failure to commit a transaction into the CIL is not handled
    correctly. This currently can only happen when racing with a
    shutdown and requires an explicit shutdown check, so it is rare and can
    be avoided. Remove the shutdown check and make the CIL commit a void
    function to indicate it will always succeed, thereby removing the
    incorrectly handled failure case.
    
    Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
    Reviewed-by: Christoph Hellwig <hch@xxxxxx>
    Reviewed-by: Alex Elder <aelder@xxxxxxx>

commit 5315837daee7ed76c31ef643915f7d76ef8c1aa3
Author: Dave Chinner <dchinner@xxxxxxxxxx>
Date:   Thu Jan 27 12:18:18 2011 +1100

    xfs: limit extsize to size of AGs and/or MAXEXTLEN
    
    The extent size hint can be set to larger than an AG. This means
    that the alignment process can push the range to be allocated
    outside the bounds of the AG, resulting in assert failures or
    corrupted bmbt records. Similarly, if the extsize is larger than the
    maximum extent size supported, the alignment process will produce
    extents that are too large to fit into the bmbt records, resulting
    in a different type of assert/corruption failure.
    
    Fix this by limiting extsize at the time it is set firstly to be
    less than MAXEXTLEN, then to be a maximum of half the size of the
    AGs in the filesystem for non-realtime inodes. Realtime inodes do
    not allocate out of AGs, so don't have to be restricted by the size
    of AGs.
    
    Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
    Reviewed-by: Christoph Hellwig <hch@xxxxxx>
    Reviewed-by: Alex Elder <aelder@xxxxxxx>
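
The limits described above can be sketched as a validation helper. The
function name and parameters are hypothetical, the MAXEXTLEN value is
assumed to be the 21-bit extent length limit, and the exact boundary
handling (whether a hint equal to the limit is accepted) is a guess; the
commit itself is the authority on both.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAXEXTLEN 0x001fffffu   /* assumed: 21-bit extent length field */

/*
 * Decide whether an extent size hint (in filesystem blocks) is
 * acceptable. Non-realtime inodes are additionally limited to half the
 * AG size; realtime inodes do not allocate out of AGs.
 */
static bool extsize_hint_ok(uint32_t extsize_fsb, uint32_t agblocks,
                            bool realtime)
{
    if (extsize_fsb > MAXEXTLEN)
        return false;           /* would not fit in a bmbt record */
    if (!realtime && extsize_fsb > agblocks / 2)
        return false;           /* alignment could leave the AG */
    return true;
}
```

For example, with 4096-block AGs a hint of 2048 blocks passes for a
regular inode, while a hint of 4096 blocks passes only for a realtime
inode.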

commit 4ce159890c00e2cc705e955a939bf1dca7b07ab8
Author: Dave Chinner <dchinner@xxxxxxxxxx>
Date:   Thu Jan 27 12:17:58 2011 +1100

    xfs: prevent extsize alignment from exceeding maximum extent size
    
    When doing delayed allocation, if the allocation size is for a
    maximally sized extent, extent size alignment can push it over this
    limit. This results in an assert failure in xfs_bmbt_set_allf() as
    the extent length is too large to fit in the extent record.
    
    Fix this by ensuring that we allow for space that extent size
    alignment requires (up to 2 * (extsize - 1) blocks as we have to
    handle both head and tail alignment) when limiting the maximum size
    of the extent.
    
    Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
    Reviewed-by: Christoph Hellwig <hch@xxxxxx>
    Reviewed-by: Alex Elder <aelder@xxxxxxx>
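
The 2 * (extsize - 1) headroom above follows from the arithmetic:
rounding the start down can add up to extsize - 1 blocks, and rounding
the end up can add up to another extsize - 1. A minimal sketch, with
hypothetical helper names and an assumed MAXEXTLEN value of 0x1fffff:

```c
#include <stdint.h>

#define MAXEXTLEN 0x001fffffu   /* assumed: 21-bit extent length field */

/*
 * Cap a delayed allocation length so that head and tail alignment
 * cannot push it past MAXEXTLEN.
 * Assumes 2 * (extsize - 1) is smaller than MAXEXTLEN.
 */
static uint64_t cap_for_alignment(uint64_t len, uint64_t extsize)
{
    uint64_t limit = MAXEXTLEN;

    if (extsize > 1)
        limit -= 2 * (extsize - 1);
    return len < limit ? len : limit;
}

/* Length after rounding start down and end up to extsize multiples. */
static uint64_t aligned_len(uint64_t start, uint64_t len, uint64_t extsize)
{
    uint64_t head, end;

    if (extsize <= 1)
        return len;
    head = start - start % extsize;         /* head alignment */
    end = start + len;
    if (end % extsize)
        end += extsize - end % extsize;     /* tail alignment */
    return end - head;
}
```

With extsize = 16 and a start offset of 15, an uncapped request of
MAXEXTLEN blocks aligns to more than MAXEXTLEN, while the capped request
stays within it.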

commit 14b064ceaa6f51a7426cc45b4b43685b94380658
Author: Dave Chinner <dchinner@xxxxxxxxxx>
Date:   Thu Jan 27 12:16:28 2011 +1100

    xfs: limit extent length for allocation to AG size
    
    Delayed allocation extents can be larger than AGs, so when trying to
    convert a large range we may scan every AG inside
    xfs_bmap_alloc_nullfb() looking for one with more free space than an
    AG can hold. This causes excessive CPU usage when there are lots of
    AGs. We should instead stop at the first AG that offers the maximum
    possible allocation size.
    
    The same problem occurs when doing preallocation of a range larger
    than an AG.
    
    Fix the problem by limiting real allocation lengths to the maximum
    that an AG can support. This means if we have empty AGs, we'll stop
    the search at the first of them. If there are no empty AGs, we'll
    still scan them all, but that is a different problem....
    
    Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
    Reviewed-by: Christoph Hellwig <hch@xxxxxx>
    Reviewed-by: Alex Elder <aelder@xxxxxxx>
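
The effect of the clamp above can be shown with a toy model of the AG
scan. The array of per-AG longest free extents and both helpers are
assumptions made for illustration, not the logic in
xfs_bmap_alloc_nullfb().

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Walk the AGs looking for one whose longest free extent satisfies
 * "want". Stores the chosen AG index in *chosen (or -1 if none) and
 * returns how many AGs were examined.
 */
static size_t scan_ags(const uint32_t *longest_free, size_t agcount,
                       uint64_t want, int *chosen)
{
    size_t scanned = 0;

    *chosen = -1;
    for (size_t ag = 0; ag < agcount; ag++) {
        scanned++;
        if (longest_free[ag] >= want) {
            *chosen = (int)ag;
            break;
        }
    }
    return scanned;
}

/* Limit a request to the largest extent a single AG can hold. */
static uint64_t clamp_to_ag(uint64_t want, uint32_t max_ag_extent)
{
    return want < max_ag_extent ? want : max_ag_extent;
}
```

A request larger than any AG can never match, so the unclamped scan
visits every AG; after clamping, the scan stops at the first empty AG.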

commit b8fc82630ae289bb4e661567808afc59e3298dce
Author: Dave Chinner <dchinner@xxxxxxxxxx>
Date:   Thu Jan 27 12:14:12 2011 +1100

    xfs: speculative delayed allocation uses rounddown_power_of_2 badly
    
    rounddown_power_of_2() returns an undefined result when passed a
    value of zero. The speculative delayed allocation code is doing this
    when the inode is zero length. Hence occasionally the preallocation
    is much, much larger than is necessary (e.g. 8GB for a 270 _byte_
    file). Ensure we never pass a zero value to this function so
    the result of preallocation is always the desired size.
    
    Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
    Reviewed-by: Christoph Hellwig <hch@xxxxxx>
    Reviewed-by: Alex Elder <aelder@xxxxxxx>
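
The guard described above can be sketched in userspace C. Both helpers
are hypothetical stand-ins, not the kernel routine or the XFS
preallocation code: the rounding helper is written as a loop and
documents that its argument must be non-zero, and the caller checks for
zero before using it.

```c
#include <stdint.h>

/*
 * Round n down to the nearest power of two. The caller must pass a
 * non-zero value; zero has no meaningful power-of-two floor.
 */
static uint64_t rounddown_pow2_nonzero(uint64_t n)
{
    uint64_t p = 1;

    while (n >>= 1)
        p <<= 1;
    return p;
}

/*
 * Pick a speculative preallocation size from the current file size in
 * blocks, falling back to a default when the file is empty.
 */
static uint64_t prealloc_blocks(uint64_t isize_blocks,
                                uint64_t default_blocks)
{
    if (isize_blocks == 0)
        return default_blocks;  /* never hand zero to the rounding helper */
    return rounddown_pow2_nonzero(isize_blocks);
}
```

An empty file then gets the default size instead of whatever an
undefined rounding result happens to produce.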

commit e34a314c5e49fe6b763568f6576b19f1299c33c2
Author: Dave Chinner <dchinner@xxxxxxxxxx>
Date:   Thu Jan 27 12:13:35 2011 +1100

    xfs: fix efi item leak on forced shutdown
    
    After test 139, kmemleak shows:
    
    unreferenced object 0xffff880078b405d8 (size 400):
      comm "xfs_io", pid 4904, jiffies 4294909383 (age 1186.728s)
      hex dump (first 32 bytes):
        60 c1 17 79 00 88 ff ff 60 c1 17 79 00 88 ff ff  `..y....`..y....
        00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      backtrace:
        [<ffffffff81afb04d>] kmemleak_alloc+0x2d/0x60
        [<ffffffff8115c6cf>] kmem_cache_alloc+0x13f/0x2b0
        [<ffffffff814aaa97>] kmem_zone_alloc+0x77/0xf0
        [<ffffffff814aab2e>] kmem_zone_zalloc+0x1e/0x50
        [<ffffffff8147cd6b>] xfs_efi_init+0x4b/0xb0
        [<ffffffff814a4ee8>] xfs_trans_get_efi+0x58/0x90
        [<ffffffff81455fab>] xfs_bmap_finish+0x8b/0x1d0
        [<ffffffff814851b4>] xfs_itruncate_finish+0x2c4/0x5d0
        [<ffffffff814a970f>] xfs_setattr+0x8df/0xa70
        [<ffffffff814b5c7b>] xfs_vn_setattr+0x1b/0x20
        [<ffffffff8117dc00>] notify_change+0x170/0x2e0
        [<ffffffff81163bf6>] do_truncate+0x66/0xa0
        [<ffffffff81163d0b>] sys_ftruncate+0xdb/0xe0
        [<ffffffff8103a002>] system_call_fastpath+0x16/0x1b
        [<ffffffffffffffff>] 0xffffffffffffffff
    
    The cause of the leak is that the "remove" parameter of IOP_UNPIN()
    is never set when a CIL push is aborted. This means that the EFI
    item is never freed if it was in the push being cancelled. The
    problem is specific to delayed logging, but has uncovered a couple
    of problems with the handling of IOP_UNPIN(remove).
    
    Firstly, we cannot safely call xfs_trans_del_item() from IOP_UNPIN()
    in the CIL commit failure path or the iclog write failure path
    because for delayed logging we have no transaction context. Hence we
    must only call xfs_trans_del_item() if the log item being unpinned
    has an active log item descriptor.
    
    Secondly, xfs_trans_uncommit() does not handle log item descriptor
    freeing during the traversal of log items on a transaction. It can
    reference a freed log item descriptor when unpinning an EFI item.
    Hence it needs to use a safe list traversal method to allow items to
    be removed from the transaction during IOP_UNPIN().
    
    Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
    Reviewed-by: Alex Elder <aelder@xxxxxxx>
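
The "safe list traversal" mentioned above means saving the next pointer
before running a callback that may free the current entry. A minimal
sketch on a singly linked list follows; the item structure and the
remove_on_unpin flag are hypothetical, and the kernel code uses its own
list helpers rather than this loop.

```c
#include <stdlib.h>

struct item {
    struct item *next;
    int remove_on_unpin;    /* hypothetical: unpin should free this item */
};

/*
 * Unpin every item on the list. Items flagged for removal are unlinked
 * and freed during the walk. Returns the head of the remaining list and
 * adds the number of freed items to *freed.
 */
static struct item *unpin_all(struct item *head, int *freed)
{
    struct item **link = &head;
    struct item *it, *next;

    for (it = head; it != NULL; it = next) {
        next = it->next;            /* saved before "it" can be freed */
        if (it->remove_on_unpin) {
            *link = next;           /* unlink from the list */
            free(it);
            (*freed)++;
        } else {
            link = &it->next;
        }
    }
    return head;
}
```

A loop that read it->next after the free would touch freed memory, which
is the class of bug the second fix above addresses.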

commit 7db37c5e6575b229a5051be1d3ef15257ae0ba5d
Author: Dave Chinner <dchinner@xxxxxxxxxx>
Date:   Thu Jan 27 12:02:00 2011 +1100

    xfs: fix log ticket leak on forced shutdown.
    
    The kmemleak detector shows this after test 139:
    
    unreferenced object 0xffff880079b88bb0 (size 264):
      comm "xfs_io", pid 4904, jiffies 4294909382 (age 276.824s)
      hex dump (first 32 bytes):
        00 00 00 00 ad 4e ad de ff ff ff ff 00 00 00 00  .....N..........
        ff ff ff ff ff ff ff ff 48 7b c9 82 ff ff ff ff  ........H{......
      backtrace:
        [<ffffffff81afb04d>] kmemleak_alloc+0x2d/0x60
        [<ffffffff8115c6cf>] kmem_cache_alloc+0x13f/0x2b0
        [<ffffffff814aaa97>] kmem_zone_alloc+0x77/0xf0
        [<ffffffff814aab2e>] kmem_zone_zalloc+0x1e/0x50
        [<ffffffff8148f394>] xlog_ticket_alloc+0x34/0x170
        [<ffffffff81494444>] xlog_cil_push+0xa4/0x3f0
        [<ffffffff81494eca>] xlog_cil_force_lsn+0x15a/0x160
        [<ffffffff814933a5>] _xfs_log_force_lsn+0x75/0x2d0
        [<ffffffff814a264d>] _xfs_trans_commit+0x2bd/0x2f0
        [<ffffffff8148bfdd>] xfs_iomap_write_allocate+0x1ad/0x350
        [<ffffffff814ac17f>] xfs_map_blocks+0x21f/0x370
        [<ffffffff814ad1b7>] xfs_vm_writepage+0x1c7/0x550
        [<ffffffff8112200a>] __writepage+0x1a/0x50
        [<ffffffff81122df2>] write_cache_pages+0x1c2/0x4c0
        [<ffffffff81123117>] generic_writepages+0x27/0x30
        [<ffffffff814aba5d>] xfs_vm_writepages+0x5d/0x80
    
    By inspection, the leak occurs when xlog_write() returns an error
    and we jump to the abort path without dropping the reference on the
    active ticket.
    
    Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
    Reviewed-by: Christoph Hellwig <hch@xxxxxx>
    Reviewed-by: Alex Elder <aelder@xxxxxxx>
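
The leak described above is the usual shape of a missing reference drop
on an error path. The sketch below models it with a hypothetical
reference-counted ticket and a counter of live tickets; none of these
names are the real log ticket API. The write_error argument stands in
for the return value of the log write.

```c
#include <stdlib.h>

static int live_tickets;        /* tickets allocated and not yet freed */

struct ticket {
    int ref;
};

static struct ticket *ticket_alloc(void)
{
    struct ticket *t = malloc(sizeof(*t));

    if (t == NULL)
        return NULL;
    t->ref = 1;
    live_tickets++;
    return t;
}

/* Drop one reference; free the ticket when the last one goes away. */
static void ticket_put(struct ticket *t)
{
    if (--t->ref == 0) {
        live_tickets--;
        free(t);
    }
}

/* Push sketch: the abort path must drop the ticket reference too. */
static int cil_push_sketch(int write_error)
{
    struct ticket *tic = ticket_alloc();

    if (tic == NULL)
        return -1;
    if (write_error)
        goto out_abort;
    ticket_put(tic);            /* normal completion */
    return 0;

out_abort:
    ticket_put(tic);            /* without this the ticket leaks */
    return write_error;
}
```

After either outcome the live ticket count returns to zero in this
model.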

-----------------------------------------------------------------------

Summary of changes:
 fs/xfs/linux-2.6/xfs_ioctl.c |   20 ++++++++++++-
 fs/xfs/quota/xfs_qm.c        |   46 ++++++++++++++-----------------
 fs/xfs/xfs_alloc.h           |   16 +++++++++++
 fs/xfs/xfs_bmap.c            |   61 +++++++++++++++++++++++++++++++-----------
 fs/xfs/xfs_buf_item.c        |   12 +++++---
 fs/xfs/xfs_extfree_item.c    |    3 +-
 fs/xfs/xfs_iomap.c           |    7 ++++-
 fs/xfs/xfs_log.h             |    2 +-
 fs/xfs/xfs_log_cil.c         |   15 ++++------
 fs/xfs/xfs_trans.c           |   41 ++++++++++++++++++++-------
 10 files changed, 152 insertions(+), 71 deletions(-)


hooks/post-receive
-- 
XFS development tree
