<div dir="ltr"><div><div>Hi Brian and Eric,<br><br></div>I have applied your kernel path v2(add unlink trace) to <a href="http://mirror.linux.duke.edu/pub/centos/6.3/updates/x86_64/Packages/kernel-2.6.32-279.19.1.el6.x86_64.rpm" target="_blank">kernel-2.6.32-279.19.1.el6.x86_64.rpm</a> in out test cluster and start testing again.<br>
</div>Any progress will let you known. Thank you.<br></div><div class="gmail_extra"><br><br><div class="gmail_quote">2013/4/19 Brian Foster <span dir="ltr"><<a href="mailto:bfoster@redhat.com" target="_blank">bfoster@redhat.com</a>></span><br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im">On 04/18/2013 11:23 AM, ·ûÓÀÌΠwrote:<br>
> Hi Brian and Eric,<br>
</div><div class="im">> The shutdown is not easy to produce but finally right now 2 of our<br>
> servers in our test cluster xfs was shutdown.<br>
><br>
<br>
</div>Understood. We've been trying very hard to reproduce ourselves to make<br>
it easier to debug, but haven't been able to reproduce at all so far.<br>
This process allows us to make _some_ progress on the issue, even if it<br>
is slower going than we'd like... ;)<br>
<div class="im"><br>
> the trace output as following<br>
> <a href="https://docs.google.com/file/d/0B7n2C4T5tfNCLXRYUWJ0b19JcWc/edit?usp=sharing" target="_blank">https://docs.google.com/file/d/0B7n2C4T5tfNCLXRYUWJ0b19JcWc/edit?usp=sharing</a><br>
><br>
<br>
</div>Thanks again for the data. The racing behavior Eric called out (nice<br>
catch!) in his last mail lit up some light bulbs internally with regard<br>
to some old locking issues triggered by XFS in the 6.3 kernel. The<br>
following bug serves as an example:<br>
<br>
<a href="https://bugzilla.redhat.com/show_bug.cgi?id=852847" target="_blank">https://bugzilla.redhat.com/show_bug.cgi?id=852847</a><br>
<br>
... the fix for which went into the 2.6.32-279.19.1 6.3.z release. Could<br>
you move some or all of your servers to this kernel[1] and see how it<br>
goes? The best case is it resolves the problem, worst case we carry on<br>
debugging from there...<br>
<br>
Brian<br>
<br>
[1] -<br>
<a href="http://mirror.linux.duke.edu/pub/centos/6.3/updates/x86_64/Packages/kernel-2.6.32-279.19.1.el6.x86_64.rpm" target="_blank">http://mirror.linux.duke.edu/pub/centos/6.3/updates/x86_64/Packages/kernel-2.6.32-279.19.1.el6.x86_64.rpm</a><br>

<div class="im"><br>
> Sorry but the systemtap is interrupt and I didn't noticed that so I<br>
> didn't get systemtap logs.<br>
><br>
> /var/log/message is same as before<br>
> Apr 18 22:43:14 10 kernel: XFS (sdb): xfs_iunlink_remove: xfs_inotobp()<br>
> returned error 22.<br>
> Apr 18 22:43:14 10 kernel: XFS (sdb): xfs_inactive: xfs_ifree returned<br>
> error 22<br>
> Apr 18 22:43:14 10 kernel: XFS (sdb): xfs_do_force_shutdown(0x1) called<br>
> from line 1184 of file fs/xfs/xfs_vnodeops.c.  Return address =<br>
> 0xffffffffa02d44aa<br>
> Apr 18 22:43:14 10 kernel: XFS (sdb): I/O Error Detected. Shutting down<br>
> filesystem<br>
> Apr 18 22:43:14 10 kernel: XFS (sdb): Please umount the filesystem and<br>
> rectify the problem(s)<br>
> Apr 18 22:43:20 10 kernel: XFS (sdb): xfs_log_force: error 5 returned.<br>
><br>
> The metadump file is large I'll share it to you soon.<br>
><br>
><br>
</div>> 2013/4/18 Brian Foster <<a href="mailto:bfoster@redhat.com">bfoster@redhat.com</a> <mailto:<a href="mailto:bfoster@redhat.com">bfoster@redhat.com</a>>><br>
<div><div class="h5">><br>
>     On 04/18/2013 04:25 AM, ·ûÓÀÌΠwrote:<br>
>     > Hi Brian and Eric,<br>
>     > Can I change as following to bypass it?<br>
><br>
>     This is probably not a wise thing to do. The problem we're seeing here<br>
>     is indicative of a potentially larger problem than this particular error<br>
>     path. An inode is being unlinked and inactivated, but we aren't finding<br>
>     on the list where we expect it to be. Killing the error return doesn't<br>
>     eliminate the larger problem.<br>
><br>
>     So while changes could end up being made in this area as part of a fix,<br>
>     I would not suggest making any changes beyond those designed to help<br>
>     debug until we have a better idea of root cause.<br>
><br>
>     Brian<br>
><br>
>     > --- a/xfs_inode.c<br>
>     > +++ b/xfs_inode.c<br>
>     > @@ -1764,7 +1764,7 @@ xfs_iunlink_remove(<br>
>     >                  */<br>
>     >                 next_agino =<br>
>     be32_to_cpu(agi->agi_unlinked[bucket_index]);<br>
>     >                 last_ibp = NULL;<br>
>     > -               while (next_agino != agino) {<br>
>     > +               while (next_agino != agino && next_agino !=<br>
>     NULLAGINO) {<br>
>     >                         /*<br>
>     >                          * If the last inode wasn't the one<br>
>     pointing to<br>
>     >                          * us, then release its buffer since we're not<br>
>     > @@ -1786,6 +1786,14 @@ xfs_iunlink_remove(<br>
>     >                         ASSERT(next_agino != NULLAGINO);<br>
>     >                         ASSERT(next_agino != 0);<br>
>     >                 }<br>
>     > +               if (next_agino == NULLAGINO) {<br>
>     > +                       /*<br>
>     > +                        *After search the list for the inode<br>
>     being free<br>
>     > +                        *we still can't find it.<br>
>     > +                        */<br>
>     > +                       xfs_err(mp, "%s ino %lld not found in unlinked<br>
>     > list.",<br>
>     > +                                    __func__, (unsigned long<br>
>     > long)ip->i_ino);<br>
>     > +               }<br>
>     >                 /*<br>
>     >                  * Now last_ibp points to the buffer previous to us on<br>
>     >                  * the unlinked list.  Pull us from the list.<br>
>     > @@ -1810,16 +1818,20 @@ xfs_iunlink_remove(<br>
>     >                 } else {<br>
>     >                         xfs_trans_brelse(tp, ibp);<br>
>     >                 }<br>
>     > -               /*<br>
>     > -                * Point the previous inode on the list to the<br>
>     next inode.<br>
>     > -                */<br>
>     > -               last_dip->di_next_unlinked = cpu_to_be32(next_agino);<br>
>     > -               ASSERT(next_agino != 0);<br>
>     > -               offset = last_offset + offsetof(xfs_dinode_t,<br>
>     > di_next_unlinked);<br>
>     > -               xfs_trans_inode_buf(tp, last_ibp);<br>
>     > -               xfs_trans_log_buf(tp, last_ibp, offset,<br>
>     > -                                 (offset + sizeof(xfs_agino_t) - 1));<br>
>     > -               xfs_inobp_check(mp, last_ibp);<br>
>     > +               if (next_agino != NULLAGINO) {<br>
>     > +                       /*<br>
>     > +                       * If only find the inode being free then<br>
>     we modify<br>
>     > +                       * the unlinked list.<br>
>     > +                       * Point the previous inode on the list to the<br>
>     > next inode.<br>
>     > +                       */<br>
>     > +                       last_dip->di_next_unlinked =<br>
>     > cpu_to_be32(next_agino);<br>
>     > +                       ASSERT(next_agino != 0);<br>
>     > +                       offset = last_offset + offsetof(xfs_dinode_t,<br>
>     > di_next_unlinked);<br>
>     > +                       xfs_trans_inode_buf(tp, last_ibp);<br>
>     > +                       xfs_trans_log_buf(tp, last_ibp, offset,<br>
>     > +                                         (offset +<br>
>     sizeof(xfs_agino_t)<br>
>     > - 1));<br>
>     > +                       xfs_inobp_check(mp, last_ibp);<br>
>     > +               }<br>
>     >         }<br>
>     >         return 0;<br>
>     >  }<br>
>     ><br>
>     > Thank you.<br>
>     ><br>
>     ><br>
>     > 2013/4/18 ·ûÓÀÌΠ<<a href="mailto:yongtaofu@gmail.com">yongtaofu@gmail.com</a> <mailto:<a href="mailto:yongtaofu@gmail.com">yongtaofu@gmail.com</a>><br>
</div></div>>     <mailto:<a href="mailto:yongtaofu@gmail.com">yongtaofu@gmail.com</a> <mailto:<a href="mailto:yongtaofu@gmail.com">yongtaofu@gmail.com</a>>>><br>
<div class="im">>     ><br>
>     >     Hi Brain and Eric,<br>
>     >     If the problem is the agno can't be found in the unlinked<br>
>     list. Can<br>
>     >     we just bypass it instead of passing ino=0xffffffff to<br>
>     xfs_inotobp?<br>
>     >     Thank you.<br>
>     ><br>
>     ><br>
>     >     2013/4/18 ·ûÓÀÌΠ<<a href="mailto:yongtaofu@gmail.com">yongtaofu@gmail.com</a><br>
</div>>     <mailto:<a href="mailto:yongtaofu@gmail.com">yongtaofu@gmail.com</a>> <mailto:<a href="mailto:yongtaofu@gmail.com">yongtaofu@gmail.com</a><br>
<div><div class="h5">>     <mailto:<a href="mailto:yongtaofu@gmail.com">yongtaofu@gmail.com</a>>>><br>
>     ><br>
>     >         Hi Eric,<br>
>     >         The shutdown issue is still not reproduced yet. But I get the<br>
>     >         following error today during test.<br>
>     ><br>
>     >         Apr 18 07:42:51 10 kernel: Call Trace:<br>
>     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02d91ef>] ?<br>
>     >         xfs_buf_cond_lock+0x2f/0xc0 [xfs]<br>
>     >         Apr 18 07:42:51 10 kernel: [<ffffffff814fe6a5>]<br>
>     >         schedule_timeout+0x215/0x2e0<br>
>     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02d5f07>] ?<br>
>     >         kmem_zone_alloc+0x77/0xf0 [xfs]<br>
>     >         Apr 18 07:42:51 10 kernel: [<ffffffff814ff5c2>]<br>
>     __down+0x72/0xb0<br>
>     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02da652>] ?<br>
>     >         _xfs_buf_find+0x102/0x280 [xfs]<br>
>     >         Apr 18 07:42:51 10 kernel: "echo 0 ><br>
>     >         /proc/sys/kernel/hung_task_timeout_secs" disables this<br>
>     message.<br>
>     >         Apr 18 07:42:51 10 kernel: glusterfsd    D ffffffff8160b3c0<br>
>     >         0 14522      1 0x00000083<br>
>     >         Apr 18 07:42:51 10 kernel: ffff882015a63a28 0000000000000082<br>
>     >         0000000000000000 0000000000000000<br>
>     >         Apr 18 07:42:51 10 kernel: ffff882015a639b8 ffffffffa02d91ef<br>
>     >         ffff882015a639d8 0000000000000246<br>
>     >         Apr 18 07:42:51 10 kernel: ffff880e70491af8 ffff882015a63fd8<br>
>     >         000000000000fb88 ffff880e70491af8<br>
>     >         Apr 18 07:42:51 10 kernel: Call Trace:<br>
>     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02d91ef>] ?<br>
>     >         xfs_buf_cond_lock+0x2f/0xc0 [xfs]<br>
>     >         Apr 18 07:42:51 10 kernel: [<ffffffff814fe6a5>]<br>
>     >         schedule_timeout+0x215/0x2e0<br>
>     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02d5f07>] ?<br>
>     >         kmem_zone_alloc+0x77/0xf0 [xfs]<br>
>     >         Apr 18 07:42:51 10 kernel: [<ffffffff814ff5c2>]<br>
>     __down+0x72/0xb0<br>
>     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02da652>] ?<br>
>     >         _xfs_buf_find+0x102/0x280 [xfs]<br>
>     >         Apr 18 07:42:51 10 kernel: [<ffffffff81097ef1>] down+0x41/0x50<br>
>     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02da493>]<br>
>     >         xfs_buf_lock+0x53/0x110 [xfs]<br>
>     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02da652>]<br>
>     >         _xfs_buf_find+0x102/0x280 [xfs]<br>
>     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02da83b>]<br>
>     >         xfs_buf_get+0x6b/0x1a0 [xfs]<br>
>     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02daeac>]<br>
>     >         xfs_buf_read+0x2c/0x100 [xfs]<br>
>     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02d0af8>]<br>
>     >         xfs_trans_read_buf+0x1f8/0x400 [xfs]<br>
>     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02b3444>]<br>
>     >         xfs_read_agi+0x74/0x100 [xfs]<br>
>     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02b967b>]<br>
>     >         xfs_iunlink+0x5b/0x180 [xfs]<br>
>     >         Apr 18 07:42:51 10 kernel: [<ffffffff810724c7>] ?<br>
>     >         current_fs_time+0x27/0x30<br>
>     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02d12a7>] ?<br>
>     >         xfs_trans_ichgtime+0x27/0xa0 [xfs]<br>
>     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02d15fb>]<br>
>     >         xfs_droplink+0x5b/0x70 [xfs]<br>
>     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02d2f9e>]<br>
>     >         xfs_remove+0x27e/0x3a0 [xfs]<br>
>     >         Apr 18 07:42:51 10 kernel: [<ffffffff81186fd3>] ?<br>
>     >         generic_permission+0x23/0xb0<br>
>     >         Apr 18 07:42:51 10 kernel: [<ffffffffa02e0968>]<br>
>     >         xfs_vn_unlink+0x48/0x90 [xfs]<br>
>     >         Apr 18 07:42:51 10 kernel: [<ffffffff81188c0f>]<br>
>     vfs_unlink+0x9f/0xe0<br>
>     >         Apr 18 07:42:51 10 kernel: [<ffffffff8118795a>] ?<br>
>     >         lookup_hash+0x3a/0x50<br>
>     >         Apr 18 07:42:51 10 kernel: [<ffffffff8118b143>]<br>
>     >         do_unlinkat+0x183/0x1c0<br>
>     >         Apr 18 07:42:51 10 kernel: [<ffffffff81017938>] ?<br>
>     >         syscall_trace_enter+0x1d8/0x1e0<br>
>     >         Apr 18 07:42:51 10 kernel: [<ffffffff8118b196>]<br>
>     sys_unlink+0x16/0x20<br>
>     >         Apr 18 07:42:51 10 kernel: [<ffffffff8100b308>]<br>
>     tracesys+0xd9/0xde<br>
>     ><br>
>     >         Thank you.<br>
>     ><br>
>     ><br>
>     >         2013/4/17 Eric Sandeen <<a href="mailto:sandeen@sandeen.net">sandeen@sandeen.net</a><br>
>     <mailto:<a href="mailto:sandeen@sandeen.net">sandeen@sandeen.net</a>><br>
</div></div>>     >         <mailto:<a href="mailto:sandeen@sandeen.net">sandeen@sandeen.net</a> <mailto:<a href="mailto:sandeen@sandeen.net">sandeen@sandeen.net</a>>>><br>
<div class="im">>     ><br>
>     >             On Apr 16, 2013, at 8:48 PM, ·ûÓÀÌÎ<br>
>     <<a href="mailto:yongtaofu@gmail.com">yongtaofu@gmail.com</a> <mailto:<a href="mailto:yongtaofu@gmail.com">yongtaofu@gmail.com</a>><br>
</div>>     >             <mailto:<a href="mailto:yongtaofu@gmail.com">yongtaofu@gmail.com</a><br>
<div><div class="h5">>     <mailto:<a href="mailto:yongtaofu@gmail.com">yongtaofu@gmail.com</a>>>> wrote:<br>
>     ><br>
>     >>             Hi Brain,<br>
>     >>             Can I change as following?<br>
>     ><br>
>     >             ASSERTS are no-ops in a non-debug kernel, so this won't<br>
>     >             change any behavior.  I hope we'll know more if we get new<br>
>     >             traces from your patched kernel....<br>
>     ><br>
>     >             Eric<br>
>     ><br>
>     >>             --- a/xfs_inode.c<br>
>     >>             +++ b/xfs_inode.c<br>
>     >>             @@ -1773,6 +1773,8 @@ xfs_iunlink_remove(<br>
>     >>                                     if (last_ibp != NULL) {<br>
>     >>                                             xfs_trans_brelse(tp,<br>
>     >>             last_ibp);<br>
>     >>                                     }<br>
>     >>             +                        ASSERT(next_agino != NULLAGINO);<br>
>     >>             +                        ASSERT(next_agino != 0);<br>
>     >>                                     next_ino = XFS_AGINO_TO_INO(mp,<br>
>     >>             agno, next_agino);<br>
>     >>                                     error = xfs_inotobp(mp, tp,<br>
>     >>             next_ino, &last_dip,<br>
>     >>                                                         &last_ibp,<br>
>     >>             &last_offset, 0);<br>
>     >>             @@ -1783,8 +1785,6 @@ xfs_iunlink_remove(<br>
>     >>                                             return error;<br>
>     >>                                     }<br>
>     >>                                     next_agino =<br>
>     >>             be32_to_cpu(last_dip->di_next_unlinked);<br>
>     >>             -                       ASSERT(next_agino != NULLAGINO);<br>
>     >>             -                       ASSERT(next_agino != 0);<br>
>     >>                             }<br>
>     >>                             /*<br>
>     >>                              * Now last_ibp points to the buffer<br>
>     >>             previous to us on<br>
>     >><br>
>     >>             Thank you.<br>
>     >><br>
>     >><br>
>     >>             2013/4/17 ·ûÓÀÌΠ<<a href="mailto:yongtaofu@gmail.com">yongtaofu@gmail.com</a><br>
>     <mailto:<a href="mailto:yongtaofu@gmail.com">yongtaofu@gmail.com</a>><br>
</div></div>>     >>             <mailto:<a href="mailto:yongtaofu@gmail.com">yongtaofu@gmail.com</a><br>
<div><div class="h5">>     <mailto:<a href="mailto:yongtaofu@gmail.com">yongtaofu@gmail.com</a>>>><br>
>     >><br>
>     >>                 Hi Brain,<br>
>     >>                 If it is because NULLAGINO is passed in  to<br>
>     xfs_inotobp().<br>
>     >>                 Can I move the following two lines before<br>
>     xfs_inotobp?<br>
>     >><br>
>     >>                 For example:<br>
>     >><br>
>     >>                 1767                 while (next_agino != agino) {<br>
>     >>                 1768                         /*<br>
>     >>                 1769                          * If the last inode<br>
>     >>                 wasn't the one pointing to<br>
>     >>                 1770                          * us, then release its<br>
>     >>                 buffer since we're not<br>
>     >>                 1771                          * going to do anything<br>
>     >>                 with it.<br>
>     >>                 1772                          */<br>
>     >>                 1773                         if (last_ibp != NULL) {<br>
>     >>                 1774<br>
>     >>                 xfs_trans_brelse(tp, last_ibp);<br>
>     >>                 1775                         }<br>
>     >>                 1776                         next_ino =<br>
>     >>                 XFS_AGINO_TO_INO(mp, agno, next_agino);<br>
>     >>                 +                               ASSERT(next_agino !=<br>
>     >>                 NULLAGINO);<br>
>     >>                 +                               ASSERT(next_agino<br>
>     != 0);<br>
>     >>                 1777                         error = xfs_inotobp(mp,<br>
>     >>                 tp, next_ino, &last_dip,<br>
>     >>                 1778<br>
>     >>                 &last_ibp, &last_offset, 0);<br>
>     >>                 1779                         if (error) {<br>
>     >>                 1780                                 xfs_warn(mp,<br>
>     >>                 1781                                         "%s:<br>
>     >>                 xfs_inotobp() returned error %d.",<br>
>     >>                 1782<br>
>     __func__,<br>
>     >>                 error);<br>
>     >>                 1783                                 return error;<br>
>     >>                 1784                         }<br>
>     >>                 1785                         next_agino =<br>
>     >>                 be32_to_cpu(last_dip->di_next_unlinked);<br>
>     >>                 -<br>
>     //ASSERT(next_agino !=<br>
>     >>                 NULLAGINO);<br>
>     >>                 -<br>
>     //ASSERT(next_agino != 0);<br>
>     >>                 1788                 }<br>
>     >>                 I don't understand xfs well and correct me if I'm<br>
>     >>                 totally wrong.<br>
>     >>                 Thank you very much.<br>
>     >><br>
>     >><br>
>     >>                 2013/4/17 ·ûÓÀÌΠ<<a href="mailto:yongtaofu@gmail.com">yongtaofu@gmail.com</a><br>
>     <mailto:<a href="mailto:yongtaofu@gmail.com">yongtaofu@gmail.com</a>><br>
</div></div>>     >>                 <mailto:<a href="mailto:yongtaofu@gmail.com">yongtaofu@gmail.com</a><br>
<div class="im">>     <mailto:<a href="mailto:yongtaofu@gmail.com">yongtaofu@gmail.com</a>>>><br>
>     >><br>
>     >>                     Hi Brain,<br>
>     >>                     I want to ask a question, according to the<br>
>     >>                     shutdown trace. The ino in  xfs_iunlink_remove<br>
>     >>                     is 0x113, why xfs_imap got ino=0xffffffff ?<br>
>     >><br>
>     >>                     --- xfs_imap --<br>
>     >><br>
>     module("xfs").function("xfs_imap@fs/xfs/xfs_ialloc.c:1257").return<br>
>     >>                     -- return=0x16<br>
>     >>                     vars: mp=0xffff882017a50800 tp=0xffff881c81797c70<br>
>     >>                     ino=0xffffffff<br>
>     >><br>
>     >>                     --- xfs_iunlink_remove --<br>
>     >><br>
>     module("xfs").function("xfs_iunlink_remove@fs/xfs/xfs_inode.c:1680").return<br>
>     >>                     -- return=0x16<br>
>     >>                     vars: tp=0xffff881c81797c70 ip=0xffff881003c13c00<br>
>     >>                     next_ino=? mp=? agi=? dip=?<br>
>     >>                     agibp=0xffff880109b47e20 ibp=? agno=? agino=?<br>
>     >>                     next_agino=? last_ibp=?<br>
>     >>                     last_dip=0xffff882000000000 bucket_index=?<br>
>     >>                     offset=? last_offset=0xffffffffffff8810 error=?<br>
>     >>                     __func__=[...]<br>
>     >>                     ip: i_ino = 0x113, i_flags = 0x0<br>
>     >><br>
>     >>                     Thank you.<br>
>     >><br>
>     >><br>
>     >><br>
>     >>                     2013/4/17 ·ûÓÀÌΠ<<a href="mailto:yongtaofu@gmail.com">yongtaofu@gmail.com</a><br>
>     <mailto:<a href="mailto:yongtaofu@gmail.com">yongtaofu@gmail.com</a>><br>
</div>>     >>                     <mailto:<a href="mailto:yongtaofu@gmail.com">yongtaofu@gmail.com</a><br>
<div class="im">>     <mailto:<a href="mailto:yongtaofu@gmail.com">yongtaofu@gmail.com</a>>>><br>
>     >><br>
>     >>                         Hi Brain,<br>
>     >>                         Thank you for your update, and I have applied<br>
>     >>                         your last kernel patch. However it is not<br>
>     easy<br>
>     >>                         to reproduce especially in out test<br>
>     >>                         environment. Till now is not happens again.<br>
>     >>                         I'll update the kernel patch now. BTW is<br>
>     there<br>
>     >>                         any findings in the logs of previous thread?<br>
>     >><br>
>     <a href="http://oss.sgi.com/archives/xfs/2013-04/msg00327.html" target="_blank">http://oss.sgi.com/archives/xfs/2013-04/msg00327.html</a><br>
>     >>                         I guess it tend to happen during glusterfs<br>
>     >>                         rebalance because glusterfs moves a lot of<br>
>     >>                         file from one server to another and then<br>
>     >>                         unlink it.<br>
>     >><br>
>     >>                         Thank you.<br>
>     >><br>
>     >><br>
>     >>                         2013/4/17 Brian Foster<br>
>     <<a href="mailto:bfoster@redhat.com">bfoster@redhat.com</a> <mailto:<a href="mailto:bfoster@redhat.com">bfoster@redhat.com</a>><br>
</div>>     >>                         <mailto:<a href="mailto:bfoster@redhat.com">bfoster@redhat.com</a><br>
<div><div class="h5">>     <mailto:<a href="mailto:bfoster@redhat.com">bfoster@redhat.com</a>>>><br>
>     >><br>
>     >>                             On 04/16/2013 12:24 PM, Dave Chinner<br>
>     wrote:<br>
>     >>                             > On Mon, Apr 15, 2013 at 07:14:39PM<br>
>     >>                             -0400, Brian Foster wrote:<br>
>     >>                             >> Hi,<br>
>     >>                             >><br>
>     >>                             >> Thanks for the data in the<br>
>     previous thread:<br>
>     >>                             >><br>
>     >>                             >><br>
>     >><br>
>     <a href="http://oss.sgi.com/archives/xfs/2013-04/msg00327.html" target="_blank">http://oss.sgi.com/archives/xfs/2013-04/msg00327.html</a><br>
>     >>                             >><br>
>     >>                             ...<br>
>     >>                             >><br>
>     >>                             >>      echo 1 ><br>
>     >><br>
>     /sys/kernel/debug/tracing/events/xfs/xfs_iunlink/enable<br>
>     >>                             >>      echo 1 ><br>
>     >><br>
>     /sys/kernel/debug/tracing/events/xfs/xfs_iunlink_remove/enable<br>
>     >>                             >>      ... reproduce ...<br>
>     >>                             >>      cat<br>
>     >>                             /sys/kernel/debug/tracing/trace ><br>
>     trace.output<br>
>     >>                             ><br>
>     >>                             > It's better to use trace-cmd for this.<br>
>     >>                             it will result in less<br>
>     >>                             > dropped events. i.e.:<br>
>     >>                             ><br>
>     >>                             >       $ trace-cmd record -e<br>
>     xfs_iunlink\*<br>
>     >>                             >       ... reproduce ...<br>
>     >>                             >       ^C<br>
>     >>                             >       $ trace-cmd report > trace.output<br>
>     >>                             ><br>
>     >>                             >> --- a/fs/xfs/linux-2.6/xfs_trace.h<br>
>     >>                             >> +++ b/fs/xfs/linux-2.6/xfs_trace.h<br>
>     >>                             >> @@ -581,6 +581,8 @@<br>
>     >>                             DEFINE_INODE_EVENT(xfs_file_fsync);<br>
>     >>                             ...<br>
>     >>                             ><br>
>     >>                             > I would suggest that the the tracing<br>
>     >>                             shoul dbe at entry of the<br>
>     >>                             > function, otherwise we won't get a<br>
>     >>                             tracepoint for the operation that<br>
>     >>                             > triggers the shutdown. (That's the<br>
>     >>                             reason most tracepoints in XFS<br>
>     >>                             > are at function entry...)<br>
>     >>                             ><br>
>     >><br>
>     >>                             Good points, thanks Dave. A v2 that pulls<br>
>     >>                             up the tracepoints towards<br>
>     >>                             function entry is appended.<br>
>     >><br>
>     >>                             Brian<br>
>     >><br>
>     >>                             From<br>
>     >>                             280943e78ebe0b97a774cba51e7815c42f044b55<br>
>     >>                             Mon Sep 17 00:00:00 2001<br>
>     >>                             From: Brian Foster<br>
>     <<a href="mailto:bfoster@redhat.com">bfoster@redhat.com</a> <mailto:<a href="mailto:bfoster@redhat.com">bfoster@redhat.com</a>><br>
</div></div>>     >>                             <mailto:<a href="mailto:bfoster@redhat.com">bfoster@redhat.com</a><br>
<div><div class="h5">>     <mailto:<a href="mailto:bfoster@redhat.com">bfoster@redhat.com</a>>>><br>
>     >>                             Date: Mon, 15 Apr 2013 18:16:24 -0400<br>
>     >>                             Subject: [PATCH v2] xfs: add tracepoints<br>
>     >>                             for xfs_iunlink and<br>
>     >>                             xfs_iunlink_remove<br>
>     >><br>
>     >>                             ---<br>
>     >>                              fs/xfs/linux-2.6/xfs_trace.h |    2 ++<br>
>     >>                              fs/xfs/xfs_inode.c           |    4 ++++<br>
>     >>                              2 files changed, 6 insertions(+), 0<br>
>     >>                             deletions(-)<br>
>     >><br>
>     >>                             diff --git a/fs/xfs/linux-2.6/xfs_trace.h<br>
>     >>                             b/fs/xfs/linux-2.6/xfs_trace.h<br>
>     >>                             index adc6ec4..338a0f9 100644<br>
>     >>                             --- a/fs/xfs/linux-2.6/xfs_trace.h<br>
>     >>                             +++ b/fs/xfs/linux-2.6/xfs_trace.h<br>
>     >>                             @@ -583,6 +583,8 @@<br>
>     >>                             DEFINE_INODE_EVENT(xfs_file_fsync);<br>
>     >>                              DEFINE_INODE_EVENT(xfs_destroy_inode);<br>
>     >>                              DEFINE_INODE_EVENT(xfs_dirty_inode);<br>
>     >>                              DEFINE_INODE_EVENT(xfs_clear_inode);<br>
>     >>                             +DEFINE_INODE_EVENT(xfs_iunlink);<br>
>     >>                             +DEFINE_INODE_EVENT(xfs_iunlink_remove);<br>
>     >><br>
>     >>                              DEFINE_INODE_EVENT(xfs_dquot_dqalloc);<br>
>     >>                              DEFINE_INODE_EVENT(xfs_dquot_dqdetach);<br>
>     >>                             diff --git a/fs/xfs/xfs_inode.c<br>
>     >>                             b/fs/xfs/xfs_inode.c<br>
>     >>                             index 19900f0..d705c77 100644<br>
>     >>                             --- a/fs/xfs/xfs_inode.c<br>
>     >>                             +++ b/fs/xfs/xfs_inode.c<br>
>     >>                             @@ -1615,6 +1615,8 @@ xfs_iunlink(<br>
>     >><br>
>     >>                                     mp = tp->t_mountp;<br>
>     >><br>
>     >>                             +       trace_xfs_iunlink(ip);<br>
>     >>                             +<br>
>     >>                                     /*<br>
>     >>                                      * Get the agi buffer first.  It<br>
>     >>                             ensures lock ordering<br>
>     >>                                      * on the list.<br>
>     >>                             @@ -1694,6 +1696,8 @@ xfs_iunlink_remove(<br>
>     >>                                     mp = tp->t_mountp;<br>
>     >>                                     agno = XFS_INO_TO_AGNO(mp,<br>
>     ip->i_ino);<br>
>     >><br>
>     >>                             +       trace_xfs_iunlink_remove(ip);<br>
>     >>                             +<br>
>     >>                                     /*<br>
>     >>                                      * Get the agi buffer first.  It<br>
>     >>                             ensures lock ordering<br>
>     >>                                      * on the list.<br>
>     >>                             --<br>
>     >>                             1.7.7.6<br>
>     >><br>
>     >><br>
>     >><br>
>     >><br>
>     >>                         --<br>
>     >>                         ·ûÓÀÌÎ<br>
>     >><br>
>     >><br>
>     >><br>
>     >><br>
>     >>                     --<br>
>     >>                     ·ûÓÀÌÎ<br>
>     >><br>
>     >><br>
>     >><br>
>     >><br>
>     >>                 --<br>
>     >>                 ·ûÓÀÌÎ<br>
>     >><br>
>     >><br>
>     >><br>
>     >><br>
>     >>             --<br>
>     >>             ·ûÓÀÌÎ<br>
>     >>             _______________________________________________<br>
>     >>             xfs mailing list<br>
>     >>             <a href="mailto:xfs@oss.sgi.com">xfs@oss.sgi.com</a> <mailto:<a href="mailto:xfs@oss.sgi.com">xfs@oss.sgi.com</a>><br>
</div></div>>     <mailto:<a href="mailto:xfs@oss.sgi.com">xfs@oss.sgi.com</a> <mailto:<a href="mailto:xfs@oss.sgi.com">xfs@oss.sgi.com</a>>><br>
<div class="HOEnZb"><div class="h5">>     >>             <a href="http://oss.sgi.com/mailman/listinfo/xfs" target="_blank">http://oss.sgi.com/mailman/listinfo/xfs</a><br>
>     ><br>
>     ><br>
>     ><br>
>     ><br>
>     >         --<br>
>     >         ·ûÓÀÌÎ<br>
>     ><br>
>     ><br>
>     ><br>
>     ><br>
>     >     --<br>
>     >     ·ûÓÀÌÎ<br>
>     ><br>
>     ><br>
>     ><br>
>     ><br>
>     > --<br>
>     > ·ûÓÀÌÎ<br>
><br>
><br>
><br>
><br>
> --<br>
> ·ûÓÀÌÎ<br>
<br>
</div></div></blockquote></div><br><br clear="all"><br>-- <br>·ûÓÀÌÎ
</div>