need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
符永涛
yongtaofu at gmail.com
Mon Apr 15 09:21:36 CDT 2013
Hi Eric,
I'm sorry for the spam.
I have some more information that I hope is of interest.
In glusterfs 3.3, glusterfsd/src/glusterfsd.c line 1332 contains an unlink
operation:
    if (ctx->cmd_args.pid_file) {
        unlink (ctx->cmd_args.pid_file);
        ctx->cmd_args.pid_file = NULL;
    }
GlusterFS tries to unlink the rebalance pid file after the rebalance
completes, and this may be where the issue happens.
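To make the suspected sequence concrete, here is a minimal sketch in C of
creating a pid file and then unlinking it, the same pattern as the
glusterfsd.c excerpt above. The helper name pidfile_create_and_unlink is
made up for illustration and is not part of glusterfs, and I have not
confirmed that this sequence alone reproduces the shutdown.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not glusterfs code): create a pid file at `path`,
 * write the current pid into it, close it, then unlink it.
 * Returns 0 on success or a positive errno value on failure.
 */
int pidfile_create_and_unlink(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0644);
    if (fd < 0)
        return errno;

    char buf[32];
    int len = snprintf(buf, sizeof buf, "%ld\n", (long)getpid());

    /* Treat a short write as an I/O error so the caller sees a failure. */
    ssize_t n = write(fd, buf, (size_t)len);
    int err = (n < 0) ? errno : (n != len ? EIO : 0);
    close(fd);
    if (err) {
        unlink(path);   /* best-effort cleanup */
        return err;
    }

    /* This is the step suspected of triggering xfs_iunlink_remove. */
    if (unlink(path) != 0)
        return errno;
    return 0;
}
```

Running the helper in a loop against a file on the affected XFS mount while
a rebalance is in progress might help narrow things down, but that is only
a guess at a reproducer.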
See the logs below:
1.
/var/log/secure indicates that I started the rebalance on Apr 15 11:58:11:
Apr 15 11:58:11 10 sudo: root : TTY=pts/2 ; PWD=/root ; USER=root ;
COMMAND=/usr/sbin/gluster volume rebalance testbug start
2.
After the XFS shutdown I got the following log:
--- xfs_iunlink_remove --
module("xfs").function("xfs_iunlink_remove@fs/xfs/xfs_inode.c:1680").return
-- return=0x16
vars: tp=0xffff881c81797c70 ip=0xffff881003c13c00 next_ino=? mp=? agi=?
dip=? agibp=0xffff880109b47e20 ibp=? agno=? agino=? next_agino=? last_ibp=?
last_dip=0xffff882000000000 bucket_index=? offset=?
last_offset=0xffffffffffff8810 error=? __func__=[...]
ip: i_ino = 0x113, i_flags = 0x0
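The inode that led to the XFS shutdown is 0x113 (275 in decimal). The
return value 0x16 is 22 in decimal (EINVAL), which matches the error 22 in
the subject.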
3.
I repaired the XFS filesystem and found the inode in lost+found:
[root@10.23.72.93 lost+found]# pwd
/mnt/xfsd/lost+found
[root@10.23.72.93 lost+found]# ls -l 275
---------T 1 root root 0 Apr 15 11:58 275
[root@10.23.72.93 lost+found]# stat 275
File: `275'
Size: 0 Blocks: 0 IO Block: 4096 regular empty file
Device: 810h/2064d Inode: 275 Links: 1
Access: (1000/---------T) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2013-04-15 11:58:25.833443445 +0800
Modify: 2013-04-15 11:58:25.912461256 +0800
Change: 2013-04-15 11:58:25.915442091 +0800
This file was created around 2013-04-15 11:58.
The other files in lost+found have extended attributes, but this file
does not, which means it is not one of the glusterfs backend files. It
should be the rebalance pid file.
So unlinking the rebalance pid file may be what leads to the XFS shutdown.
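The two checks above (matching the hex inode number from the trace to the
decimal file name in lost+found, and looking for extended attributes) can
be sketched in C as follows. This is only an illustration: the helper names
lost_found_name and has_xattrs are made up, it assumes Linux's listxattr(2)
from <sys/xattr.h>, and it assumes the orphaned file is named after its
decimal inode number as seen here. Since glusterfs keeps its metadata in
trusted.* attributes, the check should be run as root, otherwise those
attributes are not listed.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/xattr.h>

/*
 * Hypothetical helper: convert a hex inode number from the trace
 * (e.g. "0x113") into the decimal name seen in lost+found (e.g. "275").
 */
void lost_found_name(const char *hex_ino, char *out, size_t outlen)
{
    assert(hex_ino != NULL && out != NULL);
    /* strtoull with base 16 accepts an optional "0x" prefix. */
    snprintf(out, outlen, "%llu", strtoull(hex_ino, NULL, 16));
}

/*
 * Hypothetical helper: return 1 if `path` has at least one extended
 * attribute visible to the caller, 0 if it has none, or a negative
 * errno value if the lookup fails.
 */
int has_xattrs(const char *path)
{
    assert(path != NULL);
    /* With a zero-sized buffer, listxattr only reports the list size. */
    ssize_t len = listxattr(path, NULL, 0);
    if (len < 0)
        return -errno;
    return len > 0;
}
```

For the file above, lost_found_name("0x113", ...) yields the name 275, and
a result of 0 from has_xattrs on /mnt/xfsd/lost+found/275 would be
consistent with it not being a glusterfs backend file.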
Thank you.
2013/4/15 Eric Sandeen <sandeen at sandeen.net>
> On 4/15/13 8:45 AM, 符永涛 wrote:
> > And at the same time we got the following error log of glusterfs:
> > [2013-04-15 20:43:03.851163] I [dht-rebalance.c:1611:gf_defrag_status_get] 0-glusterfs: Rebalance is completed
> > [2013-04-15 20:43:03.851248] I [dht-rebalance.c:1614:gf_defrag_status_get] 0-glusterfs: Files migrated: 1629, size: 1582329065954, lookups: 11036, failures: 561
> > [2013-04-15 20:43:03.887634] W [glusterfsd.c:831:cleanup_and_exit] (-->/lib64/libc.so.6(clone+0x6d) [0x3bd16e767d] (-->/lib64/libpthread.so.0() [0x3bd1a07851] (-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xdd) [0x405c9d]))) 0-: received signum (15), shutting down
> > [2013-04-15 20:43:03.887878] E [rpcsvc.c:1155:rpcsvc_program_unregister_portmap] 0-rpc-service: Could not unregister with portmap
> >
>
> We'll take a look, thanks.
>
> Going forward, could I ask that you take a few minutes to batch up the
> information, rather than sending several emails in a row? It makes it much
> harder to collect the information when it's spread across so many emails.
>
> Thanks,
> -Eric
>
>
--
符永涛