Re: xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging

To: Brian Foster <bfoster@xxxxxxxxxx>
Subject: Re: xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Wed, 17 Apr 2013 02:24:17 +1000
Cc: yongtaofu@xxxxxxxxx, sandeen@xxxxxxxxxxx, xfs@xxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <516C89DF.4070904@xxxxxxxxxx>
References: <516C89DF.4070904@xxxxxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)
On Mon, Apr 15, 2013 at 07:14:39PM -0400, Brian Foster wrote:
> Hi,
> 
> Thanks for the data in the previous thread:
> 
> http://oss.sgi.com/archives/xfs/2013-04/msg00327.html
> 
> I'm spinning off a new thread specifically for this because the original
> thread is already too large and scattered to track. As Eric stated,
> please try to keep data contained in as few messages as possible.
> 
> The data confirms Dave's theory where we are going off the end of the
> unlinked list when attempting to remove an inode, pass in NULLAGINO to
> xfs_inotobp() and the attempted conversion to a global inode number
> leads to EINVAL. The next question here is why wasn't the inode listed
> in the probe output on the unlinked inode list?
> 
> Unfortunately we'll probably need to start making
> debug-level changes to the kernel to make progress on this issue. If you
> are able to recompile a kernel and/or xfs module (which you referred to
> doing in the previous thread), could you start with the patch appended
> to this message[1] and collect the xfs_iunlink and xfs_iunlink_remove
> tracepoint data the next time the problem occurs? E.g.,
> 
>       echo 1 > /sys/kernel/debug/tracing/events/xfs/xfs_iunlink/enable
>       echo 1 > /sys/kernel/debug/tracing/events/xfs/xfs_iunlink_remove/enable
>       ... reproduce ...
>       cat /sys/kernel/debug/tracing/trace > trace.output

It's better to use trace-cmd for this; it will result in fewer
dropped events, i.e.:

        $ trace-cmd record -e xfs_iunlink\*
        ... reproduce ...
        ^C
        $ trace-cmd report > trace.output

> --- a/fs/xfs/linux-2.6/xfs_trace.h
> +++ b/fs/xfs/linux-2.6/xfs_trace.h
> @@ -581,6 +581,8 @@ DEFINE_INODE_EVENT(xfs_file_fsync);
>  DEFINE_INODE_EVENT(xfs_destroy_inode);
>  DEFINE_INODE_EVENT(xfs_write_inode);
>  DEFINE_INODE_EVENT(xfs_clear_inode);
> +DEFINE_INODE_EVENT(xfs_iunlink);
> +DEFINE_INODE_EVENT(xfs_iunlink_remove);
> 
>  DEFINE_INODE_EVENT(xfs_dquot_dqalloc);
>  DEFINE_INODE_EVENT(xfs_dquot_dqdetach);
> diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> index 796edce..a43bec5 100644
> --- a/fs/xfs/xfs_inode.c
> +++ b/fs/xfs/xfs_inode.c
> @@ -1670,6 +1670,8 @@ xfs_iunlink(
>               (sizeof(xfs_agino_t) * bucket_index);
>       xfs_trans_log_buf(tp, agibp, offset,
>                         (offset + sizeof(xfs_agino_t) - 1));
> +
> +     trace_xfs_iunlink(ip);
>       return 0;
>  }
> 
> @@ -1820,6 +1822,8 @@ xfs_iunlink_remove(
>                                 (offset + sizeof(xfs_agino_t) - 1));
>               xfs_inobp_check(mp, last_ibp);
>       }
> +
> +     trace_xfs_iunlink_remove(ip);
>       return 0;

I would suggest that the tracing should be at the entry of the
function, otherwise we won't get a tracepoint for the operation that
triggers the shutdown. (That's the reason most tracepoints in XFS
are at function entry...)
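
To illustrate why placement matters, here is a minimal userspace sketch.
It is not XFS code: mock_iunlink_remove(), the two counters and demo()
are hypothetical stand-ins, with the counters playing the role of
tracepoints. A failing call takes the early error return, so only the
entry "tracepoint" records it:

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for tracepoints: each counter records one event. */
static int entry_events;
static int exit_events;

/*
 * Mock of an operation that can fail part-way through, loosely modelled
 * on the EINVAL case discussed in this thread. Assumption: the real
 * function has several early error returns like this one.
 */
static int mock_iunlink_remove(int lookup_fails)
{
	entry_events++;		/* "tracepoint" at entry: fires on every call */

	if (lookup_fails)
		return EINVAL;	/* early return skips everything below */

	exit_events++;		/* "tracepoint" before return 0: success only */
	return 0;
}

/* One successful call, then one failing call. */
static void demo(void)
{
	assert(mock_iunlink_remove(0) == 0);
	assert(mock_iunlink_remove(1) == EINVAL);
}
```

After demo() runs, entry_events counts both calls while exit_events
counts only the successful one, so the failing call is invisible to a
tracepoint placed just before "return 0".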

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
