xfs
[Top] [All Lists]

Re: [PATCH 1/2] xfs: clean up inode lockdep annotations

To: Dave Chinner <david@xxxxxxxxxxxxx>
Subject: Re: [PATCH 1/2] xfs: clean up inode lockdep annotations
From: Brian Foster <bfoster@xxxxxxxxxx>
Date: Fri, 14 Aug 2015 15:41:26 -0400
Cc: xfs@xxxxxxxxxxx, jack@xxxxxxx, oleg@xxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <1439330648-1057-2-git-send-email-david@xxxxxxxxxxxxx>
References: <1439330648-1057-1-git-send-email-david@xxxxxxxxxxxxx> <1439330648-1057-2-git-send-email-david@xxxxxxxxxxxxx>
User-agent: Mutt/1.5.23 (2014-03-12)
On Wed, Aug 12, 2015 at 08:04:07AM +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner@xxxxxxxxxx>
> 
> Lockdep annotations are a maintenance nightmare. Locking has to be
> modified to suit the limitations of the annotations, and we're
> always having to fix the annotations because they are unable to
> express the complexity of locking heirarchies correctly.
> 
> So, next up, we've got more issues with lockdep annotations for
> inode locking w.r.t. XFS_LOCK_PARENT:
> 
>       - lockdep classes are exclusive and can't be ORed together
>         to form new classes.
>       - IOLOCK needs multiple PARENT subclasses to express the
>         changes needed for the readdir locking rework needed to
>         stop the endless flow of lockdep false positives involving
>         readdir calling filldir under the ILOCK.
>       - there are only 8 unique lockdep subclasses available,
>         so we can't create a generic solution.
> 
> IOWs we need to treat the 3-bit space available to each lock type
> differently:
> 
>       - IOLOCK uses xfs_lock_two_inodes(), so needs:
>               - at least 2 IOLOCK subclasses
>               - at least 2 IOLOCK_PARENT subclasses
>       - MMAPLOCK uses xfs_lock_two_inodes(), so needs:
>               - at least 2 MMAPLOCK subclasses
>       - ILOCK uses xfs_lock_inodes with up to 5 inodes, so needs:
>               - at least 5 ILOCK subclasses
>               - one ILOCK_PARENT subclass
>               - one RTBITMAP subclass
>               - one RTSUM subclass
> 
> For the IOLOCK, split the space into two sets of subclasses.
> For the MMAPLOCK, just use half the space for the one subclass to
> match the non-parent lock classes of the IOLOCK.
> For the ILOCK, use 0-4 as the ILOCK subclasses, 5-7 for the
> remaining individual subclasses.
> 
> Because they are now all different, modify xfs_lock_inumorder() to
> handle the nested subclasses, and to assert fail if passed an
> invalid subclass. Further, annotate xfs_lock_inodes() to assert fail
> if an invalid combination of lock primitives and inode counts are
> passed that would result in a lockdep subclass annotation overflow.
> 
> Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
> ---

A few aesthetic things...

>  fs/xfs/xfs_inode.c | 68 ++++++++++++++++++++++++++++++++-----------
>  fs/xfs/xfs_inode.h | 85 
> +++++++++++++++++++++++++++++++++++++-----------------
>  2 files changed, 110 insertions(+), 43 deletions(-)
> 
> diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> index 8d0cbb5..793e6c9 100644
> --- a/fs/xfs/xfs_inode.c
> +++ b/fs/xfs/xfs_inode.c
> @@ -164,7 +164,7 @@ xfs_ilock(
>              (XFS_MMAPLOCK_SHARED | XFS_MMAPLOCK_EXCL));
>       ASSERT((lock_flags & (XFS_ILOCK_SHARED | XFS_ILOCK_EXCL)) !=
>              (XFS_ILOCK_SHARED | XFS_ILOCK_EXCL));
> -     ASSERT((lock_flags & ~(XFS_LOCK_MASK | XFS_LOCK_DEP_MASK)) == 0);
> +     ASSERT((lock_flags & ~(XFS_LOCK_MASK | XFS_LOCK_SUBCLASS_MASK)) == 0);
>  
>       if (lock_flags & XFS_IOLOCK_EXCL)
>               mrupdate_nested(&ip->i_iolock, XFS_IOLOCK_DEP(lock_flags));
> @@ -212,7 +212,7 @@ xfs_ilock_nowait(
>              (XFS_MMAPLOCK_SHARED | XFS_MMAPLOCK_EXCL));
>       ASSERT((lock_flags & (XFS_ILOCK_SHARED | XFS_ILOCK_EXCL)) !=
>              (XFS_ILOCK_SHARED | XFS_ILOCK_EXCL));
> -     ASSERT((lock_flags & ~(XFS_LOCK_MASK | XFS_LOCK_DEP_MASK)) == 0);
> +     ASSERT((lock_flags & ~(XFS_LOCK_MASK | XFS_LOCK_SUBCLASS_MASK)) == 0);
>  
>       if (lock_flags & XFS_IOLOCK_EXCL) {
>               if (!mrtryupdate(&ip->i_iolock))
> @@ -281,7 +281,7 @@ xfs_iunlock(
>              (XFS_MMAPLOCK_SHARED | XFS_MMAPLOCK_EXCL));
>       ASSERT((lock_flags & (XFS_ILOCK_SHARED | XFS_ILOCK_EXCL)) !=
>              (XFS_ILOCK_SHARED | XFS_ILOCK_EXCL));
> -     ASSERT((lock_flags & ~(XFS_LOCK_MASK | XFS_LOCK_DEP_MASK)) == 0);
> +     ASSERT((lock_flags & ~(XFS_LOCK_MASK | XFS_LOCK_SUBCLASS_MASK)) == 0);
>       ASSERT(lock_flags != 0);
>  
>       if (lock_flags & XFS_IOLOCK_EXCL)
> @@ -364,30 +364,38 @@ int xfs_lock_delays;
>  
>  /*
>   * Bump the subclass so xfs_lock_inodes() acquires each lock with a different
> - * value. This shouldn't be called for page fault locking, but we also need 
> to
> - * ensure we don't overrun the number of lockdep subclasses for the iolock or
> - * mmaplock as that is limited to 12 by the mmap lock lockdep annotations.
> + * value. This can be called for any type of inode lock combination, 
> including
> + * parent locking. Care must be taken to ensure we don't overrun the subclass
> + * storage fields in the class mask we build.
>   */
>  static inline int
>  xfs_lock_inumorder(int lock_mode, int subclass)
>  {
> +     int     class = 0;
> +
> +     ASSERT(!(lock_mode & (XFS_ILOCK_PARENT | XFS_ILOCK_RTBITMAP |
> +                           XFS_ILOCK_RTSUM)));
> +
>       if (lock_mode & (XFS_IOLOCK_SHARED|XFS_IOLOCK_EXCL)) {
> -             ASSERT(subclass + XFS_LOCK_INUMORDER <
> -                     (1 << (XFS_MMAPLOCK_SHIFT - XFS_IOLOCK_SHIFT)));
> -             lock_mode |= (subclass + XFS_LOCK_INUMORDER) << 
> XFS_IOLOCK_SHIFT;
> +             ASSERT(subclass <= XFS_IOLOCK_MAX_SUBCLASS);
> +             ASSERT(subclass + XFS_IOLOCK_PARENT_VAL <
> +                                             MAX_LOCKDEP_SUBCLASSES);
> +             class += subclass << XFS_IOLOCK_SHIFT;
> +             if (lock_mode & XFS_IOLOCK_PARENT)
> +                     class += XFS_IOLOCK_PARENT_VAL << XFS_IOLOCK_SHIFT;

Looks like you can use XFS_IOLOCK_PARENT here.

>       }
>  
>       if (lock_mode & (XFS_MMAPLOCK_SHARED|XFS_MMAPLOCK_EXCL)) {
> -             ASSERT(subclass + XFS_LOCK_INUMORDER <
> -                     (1 << (XFS_ILOCK_SHIFT - XFS_MMAPLOCK_SHIFT)));
> -             lock_mode |= (subclass + XFS_LOCK_INUMORDER) <<
> -                                                     XFS_MMAPLOCK_SHIFT;
> +             ASSERT(subclass <= XFS_MMAPLOCK_MAX_SUBCLASS);
> +             class += subclass << XFS_MMAPLOCK_SHIFT;
>       }
>  
> -     if (lock_mode & (XFS_ILOCK_SHARED|XFS_ILOCK_EXCL))
> -             lock_mode |= (subclass + XFS_LOCK_INUMORDER) << XFS_ILOCK_SHIFT;
> +     if (lock_mode & (XFS_ILOCK_SHARED|XFS_ILOCK_EXCL)) {
> +             ASSERT(subclass <= XFS_ILOCK_MAX_SUBCLASS);
> +             class += subclass << XFS_ILOCK_SHIFT;
> +     }
>  
> -     return lock_mode;
> +     return (lock_mode & ~XFS_LOCK_SUBCLASS_MASK) | class;
>  }
>  
>  /*
> @@ -399,6 +407,11 @@ xfs_lock_inumorder(int lock_mode, int subclass)
>   * transaction (such as truncate). This can result in deadlock since the long
>   * running trans might need to wait for the inode we just locked in order to
>   * push the tail and free space in the log.
> + *
> + * xfs_lock_inodes() can only be used to lock one type of lock at a time -
> + * the iolock, the mmaplock or the ilock, but not more than one at a time. 
> If we
> + * lock more than one at a time, lockdep will report false positives saying 
> we
> + * have violated locking orders.
>   */
>  void
>  xfs_lock_inodes(
> @@ -409,8 +422,29 @@ xfs_lock_inodes(
>       int             attempts = 0, i, j, try_lock;
>       xfs_log_item_t  *lp;
>  
> -     /* currently supports between 2 and 5 inodes */
> +     /*
> +      * Currently supports between 2 and 5 inodes with exclusive locking.  We
> +      * support an arbitrary depth of locking here, but absolute limits on
> +      * inodes depend on the the type of locking and the limits placed by
> +      * lockdep annotations in xfs_lock_inumorder.  These are all checked by
> +      * the asserts.
> +      */
>       ASSERT(ips && inodes >= 2 && inodes <= 5);
> +     ASSERT(lock_mode & (XFS_IOLOCK_EXCL | XFS_MMAPLOCK_EXCL |
> +                         XFS_ILOCK_EXCL));
> +     ASSERT(!(lock_mode & (XFS_IOLOCK_SHARED | XFS_MMAPLOCK_SHARED |
> +                           XFS_ILOCK_SHARED)));
> +     ASSERT(!(lock_mode & XFS_IOLOCK_EXCL) ||
> +             inodes <= XFS_IOLOCK_MAX_SUBCLASS + 1);
> +     ASSERT(!(lock_mode & XFS_MMAPLOCK_EXCL) ||
> +             inodes <= XFS_MMAPLOCK_MAX_SUBCLASS + 1);
> +     ASSERT(!(lock_mode & XFS_ILOCK_EXCL) ||
> +             inodes <= XFS_ILOCK_MAX_SUBCLASS + 1);
> +
> +     if (lock_mode & XFS_IOLOCK_EXCL) {
> +             ASSERT(!(lock_mode & (XFS_MMAPLOCK_EXCL | XFS_ILOCK_EXCL)));
> +     } else if (lock_mode & XFS_MMAPLOCK_EXCL)
> +             ASSERT(!(lock_mode & XFS_ILOCK_EXCL));
>  
>       try_lock = 0;
>       i = 0;
> diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
> index 8f22d20..ca9e119 100644
> --- a/fs/xfs/xfs_inode.h
> +++ b/fs/xfs/xfs_inode.h
> @@ -284,9 +284,9 @@ static inline int xfs_isiflocked(struct xfs_inode *ip)
>   * Flags for lockdep annotations.
>   *
>   * XFS_LOCK_PARENT - for directory operations that require locking a
> - * parent directory inode and a child entry inode.  The parent gets locked
> - * with this flag so it gets a lockdep subclass of 1 and the child entry
> - * lock will have a lockdep subclass of 0.
> + * parent directory inode and a child entry inode. IOLOCK requires nesting,
> + * MMAPLOCK does not support this class, ILOCK requires a single subclass
> + * to differentiate parent from child.
>   *
>   * XFS_LOCK_RTBITMAP/XFS_LOCK_RTSUM - the realtime device bitmap and summary
>   * inodes do not participate in the normal lock order, and thus have their
> @@ -295,30 +295,63 @@ static inline int xfs_isiflocked(struct xfs_inode *ip)
>   * XFS_LOCK_INUMORDER - for locking several inodes at the some time
>   * with xfs_lock_inodes().  This flag is used as the starting subclass
>   * and each subsequent lock acquired will increment the subclass by one.
> - * So the first lock acquired will have a lockdep subclass of 4, the
> - * second lock will have a lockdep subclass of 5, and so on. It is
> - * the responsibility of the class builder to shift this to the correct
> - * portion of the lock_mode lockdep mask.
> + * However, MAX_LOCKDEP_SUBCLASSES == 8, which means we are greatly
> + * limited to the subclasses we can represent via nesting. We need at least
> + * 5 inodes nest depth for the ILOCK through rename, and we also have to 
> support
> + * XFS_ILOCK_PARENT, which gives 6 subclasses. Then we have 
> XFS_ILOCK_RTBITMAP
> + * and XFS_ILOCK_RTSUM, which are another 2 unique subclasses, so that's all
> + * 8 subclasses supported by lockdep.
> + *
> + * This also means we have to number the sub-classes in the lowest bits of
> + * the mask we keep, and we have to ensure we never exceed 3 bits of lockdep
> + * mask and we can't use bit-masking to build the subclasses. What a mess.
> + *
> + * Bit layout:
> + *
> + * Bit               Lock Region
> + * 16-19     XFS_IOLOCK_SHIFT dependencies
> + * 20-23     XFS_MMAPLOCK_SHIFT dependencies
> + * 24-31     XFS_ILOCK_SHIFT dependencies
> + *
> + * IOLOCK values
> + *
> + * 0-3               subclass value
> + * 4-7               PARENT subclass values
> + *
> + * MMAPLOCK values
> + *
> + * 0-3               subclass value
> + * 4-7               unused
> + *
> + * ILOCK values
> + * 0-4               subclass values
> + * 5         PARENT subclass (not nestable)
> + * 6         RTBITMAP subclass (not nestable)
> + * 7         RTSUM subclass (not nestable)
> + * 

Trailing whitespace on the line above.

>   */
> -#define XFS_LOCK_PARENT              1
> -#define XFS_LOCK_RTBITMAP    2
> -#define XFS_LOCK_RTSUM               3
> -#define XFS_LOCK_INUMORDER   4
> -
> -#define XFS_IOLOCK_SHIFT     16
> -#define      XFS_IOLOCK_PARENT       (XFS_LOCK_PARENT << XFS_IOLOCK_SHIFT)
> -
> -#define XFS_MMAPLOCK_SHIFT   20
> -
> -#define XFS_ILOCK_SHIFT              24
> -#define      XFS_ILOCK_PARENT        (XFS_LOCK_PARENT << XFS_ILOCK_SHIFT)
> -#define      XFS_ILOCK_RTBITMAP      (XFS_LOCK_RTBITMAP << XFS_ILOCK_SHIFT)
> -#define      XFS_ILOCK_RTSUM         (XFS_LOCK_RTSUM << XFS_ILOCK_SHIFT)
> -
> -#define XFS_IOLOCK_DEP_MASK  0x000f0000
> -#define XFS_MMAPLOCK_DEP_MASK        0x00f00000
> -#define XFS_ILOCK_DEP_MASK   0xff000000
> -#define XFS_LOCK_DEP_MASK    (XFS_IOLOCK_DEP_MASK | \
> +#define XFS_IOLOCK_SHIFT             16
> +#define XFS_IOLOCK_PARENT_VAL                4
> +#define XFS_IOLOCK_MAX_SUBCLASS              (XFS_IOLOCK_PARENT_VAL - 1)
> +#define XFS_IOLOCK_DEP_MASK          0x000f0000
> +#define      XFS_IOLOCK_PARENT               (XFS_IOLOCK_PARENT_VAL << 
> XFS_IOLOCK_SHIFT)
> +
> +#define XFS_MMAPLOCK_SHIFT           20
> +#define XFS_MMAPLOCK_NUMORDER                0
> +#define XFS_MMAPLOCK_MAX_SUBCLASS    3
> +#define XFS_MMAPLOCK_DEP_MASK                0x00f00000
> +
> +#define XFS_ILOCK_SHIFT                      24
> +#define XFS_ILOCK_PARENT_VAL         5
> +#define XFS_ILOCK_MAX_SUBCLASS               (XFS_ILOCK_PARENT_VAL - 1)
> +#define XFS_ILOCK_RTBITMAP_VAL               6
> +#define XFS_ILOCK_RTSUM_VAL          7
> +#define XFS_ILOCK_DEP_MASK           0xff000000
> +#define      XFS_ILOCK_PARENT                (XFS_ILOCK_PARENT_VAL << 
> XFS_ILOCK_SHIFT)
> +#define      XFS_ILOCK_RTBITMAP              (XFS_ILOCK_RTBITMAP_VAL << 
> XFS_ILOCK_SHIFT)
> +#define      XFS_ILOCK_RTSUM                 (XFS_ILOCK_RTSUM_VAL << 
> XFS_ILOCK_SHIFT)
> +

While we're at it, the spacing before some of the macro names looks
inconsistent (as it was before). It looks like some use a space and
others use a tab.

With those fixes:

Reviewed-by: Brian Foster <bfoster@xxxxxxxxxx>

> +#define XFS_LOCK_SUBCLASS_MASK       (XFS_IOLOCK_DEP_MASK | \
>                                XFS_MMAPLOCK_DEP_MASK | \
>                                XFS_ILOCK_DEP_MASK)
>  
> -- 
> 2.5.0
> 
> _______________________________________________
> xfs mailing list
> xfs@xxxxxxxxxxx
> http://oss.sgi.com/mailman/listinfo/xfs

<Prev in Thread] Current Thread [Next in Thread>