xfs_iunlink_remove: xfs_inotobp() returned error 22 -- debugging
符永涛
yongtaofu at gmail.com
Fri Apr 19 23:03:10 CDT 2013
Hi Eric,
I will enable them and run the test again. I can only reproduce the problem
with glusterfs rebalance. Glusterfs uses a mechanism it calls syncop to unlink
files. For rebalance it uses
syncop_unlink (glusterfs/libglusterfs/src/syncop.c). The glusterfs
sync_task framework (glusterfs/libglusterfs/src/syncop.c) uses
"makecontext/swapcontext" <http://www.opengroup.org/onlinepubs/009695399/functions/makecontext.html>.
Could that lead to unlinks racing from different CPU cores?
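
For context, here is a minimal sketch of what makecontext/swapcontext do,
assuming a single thread and one task. The names task_body, run_task_demo,
scheduler_ctx and task_ctx are hypothetical and are not taken from syncop.c.
The point is that swapcontext only switches the user-space stack and registers
inside the calling thread; the kernel does not see the switch. Whether a
resumed task lands on a different thread, and so on a different CPU, depends
on how the surrounding framework schedules its tasks, which this sketch does
not model.

```c
#include <stdlib.h>
#include <ucontext.h>

#define STACK_SIZE (64 * 1024)

static ucontext_t scheduler_ctx;  /* context of the caller ("scheduler") */
static ucontext_t task_ctx;       /* context of the user-space task */
static int steps_done;            /* number of steps the task has completed */

/* Hypothetical task body: run one step, yield, then run a second step. */
static void task_body(void)
{
    steps_done++;                            /* step 1 */
    swapcontext(&task_ctx, &scheduler_ctx);  /* yield back to the scheduler */
    steps_done++;                            /* step 2, after being resumed */
    /* Returning from here switches to task_ctx.uc_link. */
}

/*
 * Runs the task to completion on the calling thread.
 * Returns how many times the scheduler switched into the task,
 * or -1 on setup failure.
 */
int run_task_demo(void)
{
    int resumes = 0;
    char *stack = malloc(STACK_SIZE);

    if (stack == NULL)
        return -1;

    steps_done = 0;
    if (getcontext(&task_ctx) == -1) {
        free(stack);
        return -1;
    }
    task_ctx.uc_stack.ss_sp = stack;         /* the task runs on its own stack */
    task_ctx.uc_stack.ss_size = STACK_SIZE;
    task_ctx.uc_link = &scheduler_ctx;       /* where to go when the task returns */
    makecontext(&task_ctx, task_body, 0);

    while (steps_done < 2) {
        /* Save the scheduler context and switch into the task. */
        swapcontext(&scheduler_ctx, &task_ctx);
        resumes++;
    }

    free(stack);
    return resumes;
}
```

Calling run_task_demo() from main switches into the task twice: once to start
it and once to resume it after it yields.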
Thank you.
2013/4/20 Eric Sandeen <sandeen at sandeen.net>
> On 4/19/13 7:51 PM, 符永涛 wrote:
> > After changing the mount option to sync, the shutdown still happens, and I
> > got a trace again; the inode 0x1c57d is abnormal.
>
> since this is a race on namespace operations, I wouldn't have expected
> sync to matter.
>
> >
> > https://docs.google.com/file/d/0B7n2C4T5tfNCYW1jNWhBbXBYakE/edit?usp=sharing
> > I have a question: if the problem is hard to reproduce, why did I hit it
> > 8 times in a week, in a test cluster with only 8 nodes?
> > What's the problem?
>
> you must have something unique in your environment, and we don't know what
> it is.
>
> To gather more information, can you also turn on tracepoints for:
>
> xfs_rename
> xfs_create
> xfs_link
> xfs_remove
>
> in addition to xfs_iunlink and xfs_iunlink_remove,
> and we'll see what that tells us.
>
> There are many paths that manipulate the di_nlink count, and something is
> racing, but we don't yet know what two callchains they are.
>
> The above are all the callers that manipulate the link count, so they will
> yield more information about who is manipulating the counts.
>
> Thanks,
> -Eric
>
>
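
One way the tracepoints listed above could be switched on is sketched below.
It assumes debugfs is mounted at /sys/kernel/debug and that the events appear
under events/xfs/ with the names given in the quoted message. The helper names
xfs_event_enable_path and enable_xfs_events are hypothetical. Writing to these
files needs root, and events that are missing on a given kernel are skipped.

```c
#include <stdio.h>

/* Assumption: debugfs is mounted at /sys/kernel/debug. */
#define XFS_EVENTS_DIR "/sys/kernel/debug/tracing/events/xfs"

/* Event names as listed in the quoted message. */
static const char *const xfs_events[] = {
    "xfs_rename", "xfs_create", "xfs_link", "xfs_remove",
    "xfs_iunlink", "xfs_iunlink_remove",
};
#define XFS_EVENT_COUNT (sizeof(xfs_events) / sizeof(xfs_events[0]))

/*
 * Builds the path of the "enable" file for one event.
 * Returns 0 on success, -1 if the buffer is too small.
 */
int xfs_event_enable_path(char *buf, size_t len, const char *event)
{
    int n = snprintf(buf, len, "%s/%s/enable", XFS_EVENTS_DIR, event);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Writes "1" to the enable file of every listed event.
 * Returns the number of events that were enabled successfully.
 */
int enable_xfs_events(void)
{
    char path[256];
    int enabled = 0;

    for (size_t i = 0; i < XFS_EVENT_COUNT; i++) {
        FILE *f;
        int ok;

        if (xfs_event_enable_path(path, sizeof(path), xfs_events[i]) != 0)
            continue;
        f = fopen(path, "w");
        if (f == NULL)
            continue;                 /* event missing or no permission */
        ok = (fputs("1\n", f) != EOF);
        if (fclose(f) == 0 && ok)     /* a write error may only show at close */
            enabled++;
    }
    return enabled;
}
```

A return value smaller than six from enable_xfs_events() means that at least
one event could not be enabled.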
--
符永涛