xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging

符永涛 yongtaofu at gmail.com
Thu Apr 18 10:23:16 CDT 2013


Hi Brian and Eric,
The shutdown is not easy to produce but finally right now 2 of our servers
in our test cluster xfs was shutdown.

the trace output as following
https://docs.google.com/file/d/0B7n2C4T5tfNCLXRYUWJ0b19JcWc/edit?usp=sharing

Sorry but the systemtap is interrupt and I didn't noticed that so I didn't
get systemtap logs.

/var/log/message is same as before
Apr 18 22:43:14 10 kernel: XFS (sdb): xfs_iunlink_remove: xfs_inotobp()
returned error 22.
Apr 18 22:43:14 10 kernel: XFS (sdb): xfs_inactive: xfs_ifree returned
error 22
Apr 18 22:43:14 10 kernel: XFS (sdb): xfs_do_force_shutdown(0x1) called
from line 1184 of file fs/xfs/xfs_vnodeops.c.  Return address =
0xffffffffa02d44aa
Apr 18 22:43:14 10 kernel: XFS (sdb): I/O Error Detected. Shutting down
filesystem
Apr 18 22:43:14 10 kernel: XFS (sdb): Please umount the filesystem and
rectify the problem(s)
Apr 18 22:43:20 10 kernel: XFS (sdb): xfs_log_force: error 5 returned.

The metadump file is large I'll share it to you soon.


2013/4/18 Brian Foster <bfoster at redhat.com>

> On 04/18/2013 04:25 AM, 符永涛 wrote:
> > Hi Brian and Eric,
> > Can I change as following to bypass it?
>
> This is probably not a wise thing to do. The problem we're seeing here
> is indicative of a potentially larger problem than this particular error
> path. An inode is being unlinked and inactivated, but we aren't finding
> on the list where we expect it to be. Killing the error return doesn't
> eliminate the larger problem.
>
> So while changes could end up being made in this area as part of a fix,
> I would not suggest making any changes beyond those designed to help
> debug until we have a better idea of root cause.
>
> Brian
>
> > --- a/xfs_inode.c
> > +++ b/xfs_inode.c
> > @@ -1764,7 +1764,7 @@ xfs_iunlink_remove(
> >                  */
> >                 next_agino =
> be32_to_cpu(agi->agi_unlinked[bucket_index]);
> >                 last_ibp = NULL;
> > -               while (next_agino != agino) {
> > +               while (next_agino != agino && next_agino != NULLAGINO) {
> >                         /*
> >                          * If the last inode wasn't the one pointing to
> >                          * us, then release its buffer since we're not
> > @@ -1786,6 +1786,14 @@ xfs_iunlink_remove(
> >                         ASSERT(next_agino != NULLAGINO);
> >                         ASSERT(next_agino != 0);
> >                 }
> > +               if (next_agino == NULLAGINO) {
> > +                       /*
> > +                        *After search the list for the inode being free
> > +                        *we still can't find it.
> > +                        */
> > +                       xfs_err(mp, "%s ino %lld not found in unlinked
> > list.",
> > +                                    __func__, (unsigned long
> > long)ip->i_ino);
> > +               }
> >                 /*
> >                  * Now last_ibp points to the buffer previous to us on
> >                  * the unlinked list.  Pull us from the list.
> > @@ -1810,16 +1818,20 @@ xfs_iunlink_remove(
> >                 } else {
> >                         xfs_trans_brelse(tp, ibp);
> >                 }
> > -               /*
> > -                * Point the previous inode on the list to the next
> inode.
> > -                */
> > -               last_dip->di_next_unlinked = cpu_to_be32(next_agino);
> > -               ASSERT(next_agino != 0);
> > -               offset = last_offset + offsetof(xfs_dinode_t,
> > di_next_unlinked);
> > -               xfs_trans_inode_buf(tp, last_ibp);
> > -               xfs_trans_log_buf(tp, last_ibp, offset,
> > -                                 (offset + sizeof(xfs_agino_t) - 1));
> > -               xfs_inobp_check(mp, last_ibp);
> > +               if (next_agino != NULLAGINO) {
> > +                       /*
> > +                       * If only find the inode being free then we
> modify
> > +                       * the unlinked list.
> > +                       * Point the previous inode on the list to the
> > next inode.
> > +                       */
> > +                       last_dip->di_next_unlinked =
> > cpu_to_be32(next_agino);
> > +                       ASSERT(next_agino != 0);
> > +                       offset = last_offset + offsetof(xfs_dinode_t,
> > di_next_unlinked);
> > +                       xfs_trans_inode_buf(tp, last_ibp);
> > +                       xfs_trans_log_buf(tp, last_ibp, offset,
> > +                                         (offset + sizeof(xfs_agino_t)
> > - 1));
> > +                       xfs_inobp_check(mp, last_ibp);
> > +               }
> >         }
> >         return 0;
> >  }
> >
> > Thank you.
> >
> >
> > 2013/4/18 符永涛 <yongtaofu at gmail.com <mailto:yongtaofu at gmail.com>>
> >
> >     Hi Brain and Eric,
> >     If the problem is the agno can't be found in the unlinked list. Can
> >     we just bypass it instead of passing ino=0xffffffff to xfs_inotobp?
> >     Thank you.
> >
> >
> >     2013/4/18 符永涛 <yongtaofu at gmail.com <mailto:yongtaofu at gmail.com>>
> >
> >         Hi Eric,
> >         The shutdown issue is still not reproduced yet. But I get the
> >         following error today during test.
> >
> >         Apr 18 07:42:51 10 kernel: Call Trace:
> >         Apr 18 07:42:51 10 kernel: [<ffffffffa02d91ef>] ?
> >         xfs_buf_cond_lock+0x2f/0xc0 [xfs]
> >         Apr 18 07:42:51 10 kernel: [<ffffffff814fe6a5>]
> >         schedule_timeout+0x215/0x2e0
> >         Apr 18 07:42:51 10 kernel: [<ffffffffa02d5f07>] ?
> >         kmem_zone_alloc+0x77/0xf0 [xfs]
> >         Apr 18 07:42:51 10 kernel: [<ffffffff814ff5c2>] __down+0x72/0xb0
> >         Apr 18 07:42:51 10 kernel: [<ffffffffa02da652>] ?
> >         _xfs_buf_find+0x102/0x280 [xfs]
> >         Apr 18 07:42:51 10 kernel: "echo 0 >
> >         /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> >         Apr 18 07:42:51 10 kernel: glusterfsd    D ffffffff8160b3c0
> >         0 14522      1 0x00000083
> >         Apr 18 07:42:51 10 kernel: ffff882015a63a28 0000000000000082
> >         0000000000000000 0000000000000000
> >         Apr 18 07:42:51 10 kernel: ffff882015a639b8 ffffffffa02d91ef
> >         ffff882015a639d8 0000000000000246
> >         Apr 18 07:42:51 10 kernel: ffff880e70491af8 ffff882015a63fd8
> >         000000000000fb88 ffff880e70491af8
> >         Apr 18 07:42:51 10 kernel: Call Trace:
> >         Apr 18 07:42:51 10 kernel: [<ffffffffa02d91ef>] ?
> >         xfs_buf_cond_lock+0x2f/0xc0 [xfs]
> >         Apr 18 07:42:51 10 kernel: [<ffffffff814fe6a5>]
> >         schedule_timeout+0x215/0x2e0
> >         Apr 18 07:42:51 10 kernel: [<ffffffffa02d5f07>] ?
> >         kmem_zone_alloc+0x77/0xf0 [xfs]
> >         Apr 18 07:42:51 10 kernel: [<ffffffff814ff5c2>] __down+0x72/0xb0
> >         Apr 18 07:42:51 10 kernel: [<ffffffffa02da652>] ?
> >         _xfs_buf_find+0x102/0x280 [xfs]
> >         Apr 18 07:42:51 10 kernel: [<ffffffff81097ef1>] down+0x41/0x50
> >         Apr 18 07:42:51 10 kernel: [<ffffffffa02da493>]
> >         xfs_buf_lock+0x53/0x110 [xfs]
> >         Apr 18 07:42:51 10 kernel: [<ffffffffa02da652>]
> >         _xfs_buf_find+0x102/0x280 [xfs]
> >         Apr 18 07:42:51 10 kernel: [<ffffffffa02da83b>]
> >         xfs_buf_get+0x6b/0x1a0 [xfs]
> >         Apr 18 07:42:51 10 kernel: [<ffffffffa02daeac>]
> >         xfs_buf_read+0x2c/0x100 [xfs]
> >         Apr 18 07:42:51 10 kernel: [<ffffffffa02d0af8>]
> >         xfs_trans_read_buf+0x1f8/0x400 [xfs]
> >         Apr 18 07:42:51 10 kernel: [<ffffffffa02b3444>]
> >         xfs_read_agi+0x74/0x100 [xfs]
> >         Apr 18 07:42:51 10 kernel: [<ffffffffa02b967b>]
> >         xfs_iunlink+0x5b/0x180 [xfs]
> >         Apr 18 07:42:51 10 kernel: [<ffffffff810724c7>] ?
> >         current_fs_time+0x27/0x30
> >         Apr 18 07:42:51 10 kernel: [<ffffffffa02d12a7>] ?
> >         xfs_trans_ichgtime+0x27/0xa0 [xfs]
> >         Apr 18 07:42:51 10 kernel: [<ffffffffa02d15fb>]
> >         xfs_droplink+0x5b/0x70 [xfs]
> >         Apr 18 07:42:51 10 kernel: [<ffffffffa02d2f9e>]
> >         xfs_remove+0x27e/0x3a0 [xfs]
> >         Apr 18 07:42:51 10 kernel: [<ffffffff81186fd3>] ?
> >         generic_permission+0x23/0xb0
> >         Apr 18 07:42:51 10 kernel: [<ffffffffa02e0968>]
> >         xfs_vn_unlink+0x48/0x90 [xfs]
> >         Apr 18 07:42:51 10 kernel: [<ffffffff81188c0f>]
> vfs_unlink+0x9f/0xe0
> >         Apr 18 07:42:51 10 kernel: [<ffffffff8118795a>] ?
> >         lookup_hash+0x3a/0x50
> >         Apr 18 07:42:51 10 kernel: [<ffffffff8118b143>]
> >         do_unlinkat+0x183/0x1c0
> >         Apr 18 07:42:51 10 kernel: [<ffffffff81017938>] ?
> >         syscall_trace_enter+0x1d8/0x1e0
> >         Apr 18 07:42:51 10 kernel: [<ffffffff8118b196>]
> sys_unlink+0x16/0x20
> >         Apr 18 07:42:51 10 kernel: [<ffffffff8100b308>]
> tracesys+0xd9/0xde
> >
> >         Thank you.
> >
> >
> >         2013/4/17 Eric Sandeen <sandeen at sandeen.net
> >         <mailto:sandeen at sandeen.net>>
> >
> >             On Apr 16, 2013, at 8:48 PM, 符永涛 <yongtaofu at gmail.com
> >             <mailto:yongtaofu at gmail.com>> wrote:
> >
> >>             Hi Brain,
> >>             Can I change as following?
> >
> >             ASSERTS are no-ops in a non-debug kernel, so this won't
> >             change any behavior.  I hope we'll know more if we get new
> >             traces from your patched kernel....
> >
> >             Eric
> >
> >>             --- a/xfs_inode.c
> >>             +++ b/xfs_inode.c
> >>             @@ -1773,6 +1773,8 @@ xfs_iunlink_remove(
> >>                                     if (last_ibp != NULL) {
> >>                                             xfs_trans_brelse(tp,
> >>             last_ibp);
> >>                                     }
> >>             +                        ASSERT(next_agino != NULLAGINO);
> >>             +                        ASSERT(next_agino != 0);
> >>                                     next_ino = XFS_AGINO_TO_INO(mp,
> >>             agno, next_agino);
> >>                                     error = xfs_inotobp(mp, tp,
> >>             next_ino, &last_dip,
> >>                                                         &last_ibp,
> >>             &last_offset, 0);
> >>             @@ -1783,8 +1785,6 @@ xfs_iunlink_remove(
> >>                                             return error;
> >>                                     }
> >>                                     next_agino =
> >>             be32_to_cpu(last_dip->di_next_unlinked);
> >>             -                       ASSERT(next_agino != NULLAGINO);
> >>             -                       ASSERT(next_agino != 0);
> >>                             }
> >>                             /*
> >>                              * Now last_ibp points to the buffer
> >>             previous to us on
> >>
> >>             Thank you.
> >>
> >>
> >>             2013/4/17 符永涛 <yongtaofu at gmail.com
> >>             <mailto:yongtaofu at gmail.com>>
> >>
> >>                 Hi Brain,
> >>                 If it is because NULLAGINO is passed in  to
> xfs_inotobp().
> >>                 Can I move the following two lines before xfs_inotobp?
> >>
> >>                 For example:
> >>
> >>                 1767                 while (next_agino != agino) {
> >>                 1768                         /*
> >>                 1769                          * If the last inode
> >>                 wasn't the one pointing to
> >>                 1770                          * us, then release its
> >>                 buffer since we're not
> >>                 1771                          * going to do anything
> >>                 with it.
> >>                 1772                          */
> >>                 1773                         if (last_ibp != NULL) {
> >>                 1774
> >>                 xfs_trans_brelse(tp, last_ibp);
> >>                 1775                         }
> >>                 1776                         next_ino =
> >>                 XFS_AGINO_TO_INO(mp, agno, next_agino);
> >>                 +                               ASSERT(next_agino !=
> >>                 NULLAGINO);
> >>                 +                               ASSERT(next_agino != 0);
> >>                 1777                         error = xfs_inotobp(mp,
> >>                 tp, next_ino, &last_dip,
> >>                 1778
> >>                 &last_ibp, &last_offset, 0);
> >>                 1779                         if (error) {
> >>                 1780                                 xfs_warn(mp,
> >>                 1781                                         "%s:
> >>                 xfs_inotobp() returned error %d.",
> >>                 1782                                         __func__,
> >>                 error);
> >>                 1783                                 return error;
> >>                 1784                         }
> >>                 1785                         next_agino =
> >>                 be32_to_cpu(last_dip->di_next_unlinked);
> >>                 -                               //ASSERT(next_agino !=
> >>                 NULLAGINO);
> >>                 -                               //ASSERT(next_agino !=
> 0);
> >>                 1788                 }
> >>                 I don't understand xfs well and correct me if I'm
> >>                 totally wrong.
> >>                 Thank you very much.
> >>
> >>
> >>                 2013/4/17 符永涛 <yongtaofu at gmail.com
> >>                 <mailto:yongtaofu at gmail.com>>
> >>
> >>                     Hi Brain,
> >>                     I want to ask a question, according to the
> >>                     shutdown trace. The ino in  xfs_iunlink_remove
> >>                     is 0x113, why xfs_imap got ino=0xffffffff ?
> >>
> >>                     --- xfs_imap --
> >>                     module("xfs").function("xfs_imap at fs
> /xfs/xfs_ialloc.c:1257").return
> >>                     -- return=0x16
> >>                     vars: mp=0xffff882017a50800 tp=0xffff881c81797c70
> >>                     ino=0xffffffff
> >>
> >>                     --- xfs_iunlink_remove --
> >>                     module("xfs").function("xfs_iunlink_remove at fs
> /xfs/xfs_inode.c:1680").return
> >>                     -- return=0x16
> >>                     vars: tp=0xffff881c81797c70 ip=0xffff881003c13c00
> >>                     next_ino=? mp=? agi=? dip=?
> >>                     agibp=0xffff880109b47e20 ibp=? agno=? agino=?
> >>                     next_agino=? last_ibp=?
> >>                     last_dip=0xffff882000000000 bucket_index=?
> >>                     offset=? last_offset=0xffffffffffff8810 error=?
> >>                     __func__=[...]
> >>                     ip: i_ino = 0x113, i_flags = 0x0
> >>
> >>                     Thank you.
> >>
> >>
> >>
> >>                     2013/4/17 符永涛 <yongtaofu at gmail.com
> >>                     <mailto:yongtaofu at gmail.com>>
> >>
> >>                         Hi Brain,
> >>                         Thank you for your update, and I have applied
> >>                         your last kernel patch. However it is not easy
> >>                         to reproduce especially in out test
> >>                         environment. Till now is not happens again.
> >>                         I'll update the kernel patch now. BTW is there
> >>                         any findings in the logs of previous thread?
> >>
> http://oss.sgi.com/archives/xfs/2013-04/msg00327.html
> >>                         I guess it tend to happen during glusterfs
> >>                         rebalance because glusterfs moves a lot of
> >>                         file from one server to another and then
> >>                         unlink it.
> >>
> >>                         Thank you.
> >>
> >>
> >>                         2013/4/17 Brian Foster <bfoster at redhat.com
> >>                         <mailto:bfoster at redhat.com>>
> >>
> >>                             On 04/16/2013 12:24 PM, Dave Chinner wrote:
> >>                             > On Mon, Apr 15, 2013 at 07:14:39PM
> >>                             -0400, Brian Foster wrote:
> >>                             >> Hi,
> >>                             >>
> >>                             >> Thanks for the data in the previous
> thread:
> >>                             >>
> >>                             >>
> >>
> http://oss.sgi.com/archives/xfs/2013-04/msg00327.html
> >>                             >>
> >>                             ...
> >>                             >>
> >>                             >>      echo 1 >
> >>
> /sys/kernel/debug/tracing/events/xfs/xfs_iunlink/enable
> >>                             >>      echo 1 >
> >>
> /sys/kernel/debug/tracing/events/xfs/xfs_iunlink_remove/enable
> >>                             >>      ... reproduce ...
> >>                             >>      cat
> >>                             /sys/kernel/debug/tracing/trace >
> trace.output
> >>                             >
> >>                             > It's better to use trace-cmd for this.
> >>                             it will result in less
> >>                             > dropped events. i.e.:
> >>                             >
> >>                             >       $ trace-cmd record -e xfs_iunlink\*
> >>                             >       ... reproduce ...
> >>                             >       ^C
> >>                             >       $ trace-cmd report > trace.output
> >>                             >
> >>                             >> --- a/fs/xfs/linux-2.6/xfs_trace.h
> >>                             >> +++ b/fs/xfs/linux-2.6/xfs_trace.h
> >>                             >> @@ -581,6 +581,8 @@
> >>                             DEFINE_INODE_EVENT(xfs_file_fsync);
> >>                             ...
> >>                             >
> >>                             > I would suggest that the the tracing
> >>                             shoul dbe at entry of the
> >>                             > function, otherwise we won't get a
> >>                             tracepoint for the operation that
> >>                             > triggers the shutdown. (That's the
> >>                             reason most tracepoints in XFS
> >>                             > are at function entry...)
> >>                             >
> >>
> >>                             Good points, thanks Dave. A v2 that pulls
> >>                             up the tracepoints towards
> >>                             function entry is appended.
> >>
> >>                             Brian
> >>
> >>                             From
> >>                             280943e78ebe0b97a774cba51e7815c42f044b55
> >>                             Mon Sep 17 00:00:00 2001
> >>                             From: Brian Foster <bfoster at redhat.com
> >>                             <mailto:bfoster at redhat.com>>
> >>                             Date: Mon, 15 Apr 2013 18:16:24 -0400
> >>                             Subject: [PATCH v2] xfs: add tracepoints
> >>                             for xfs_iunlink and
> >>                             xfs_iunlink_remove
> >>
> >>                             ---
> >>                              fs/xfs/linux-2.6/xfs_trace.h |    2 ++
> >>                              fs/xfs/xfs_inode.c           |    4 ++++
> >>                              2 files changed, 6 insertions(+), 0
> >>                             deletions(-)
> >>
> >>                             diff --git a/fs/xfs/linux-2.6/xfs_trace.h
> >>                             b/fs/xfs/linux-2.6/xfs_trace.h
> >>                             index adc6ec4..338a0f9 100644
> >>                             --- a/fs/xfs/linux-2.6/xfs_trace.h
> >>                             +++ b/fs/xfs/linux-2.6/xfs_trace.h
> >>                             @@ -583,6 +583,8 @@
> >>                             DEFINE_INODE_EVENT(xfs_file_fsync);
> >>                              DEFINE_INODE_EVENT(xfs_destroy_inode);
> >>                              DEFINE_INODE_EVENT(xfs_dirty_inode);
> >>                              DEFINE_INODE_EVENT(xfs_clear_inode);
> >>                             +DEFINE_INODE_EVENT(xfs_iunlink);
> >>                             +DEFINE_INODE_EVENT(xfs_iunlink_remove);
> >>
> >>                              DEFINE_INODE_EVENT(xfs_dquot_dqalloc);
> >>                              DEFINE_INODE_EVENT(xfs_dquot_dqdetach);
> >>                             diff --git a/fs/xfs/xfs_inode.c
> >>                             b/fs/xfs/xfs_inode.c
> >>                             index 19900f0..d705c77 100644
> >>                             --- a/fs/xfs/xfs_inode.c
> >>                             +++ b/fs/xfs/xfs_inode.c
> >>                             @@ -1615,6 +1615,8 @@ xfs_iunlink(
> >>
> >>                                     mp = tp->t_mountp;
> >>
> >>                             +       trace_xfs_iunlink(ip);
> >>                             +
> >>                                     /*
> >>                                      * Get the agi buffer first.  It
> >>                             ensures lock ordering
> >>                                      * on the list.
> >>                             @@ -1694,6 +1696,8 @@ xfs_iunlink_remove(
> >>                                     mp = tp->t_mountp;
> >>                                     agno = XFS_INO_TO_AGNO(mp,
> ip->i_ino);
> >>
> >>                             +       trace_xfs_iunlink_remove(ip);
> >>                             +
> >>                                     /*
> >>                                      * Get the agi buffer first.  It
> >>                             ensures lock ordering
> >>                                      * on the list.
> >>                             --
> >>                             1.7.7.6
> >>
> >>
> >>
> >>
> >>                         --
> >>                         符永涛
> >>
> >>
> >>
> >>
> >>                     --
> >>                     符永涛
> >>
> >>
> >>
> >>
> >>                 --
> >>                 符永涛
> >>
> >>
> >>
> >>
> >>             --
> >>             符永涛
> >>             _______________________________________________
> >>             xfs mailing list
> >>             xfs at oss.sgi.com <mailto:xfs at oss.sgi.com>
> >>             http://oss.sgi.com/mailman/listinfo/xfs
> >
> >
> >
> >
> >         --
> >         符永涛
> >
> >
> >
> >
> >     --
> >     符永涛
> >
> >
> >
> >
> > --
> > 符永涛
>
>


-- 
符永涛
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://oss.sgi.com/pipermail/xfs/attachments/20130418/2abfef5d/attachment-0001.html>


More information about the xfs mailing list