need help debugging xfs crash issue: xfs_iunlink_remove: xfs_inotobp() returned error 22
符永涛
yongtaofu at gmail.com
Mon Apr 15 10:24:09 CDT 2013
Hi Brian,
Here's the metadump file:
https://docs.google.com/file/d/0B7n2C4T5tfNCRGpoUWIzaTlvM0E/edit?usp=sharing
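A metadump like this is normally captured with xfs_metadump from xfsprogs,
roughly as below (the device path and output file name are placeholders):

# copy only the filesystem metadata (no file data) from the brick device;
# -o disables name obfuscation so inode numbers and file names stay readable
xfs_metadump -o /dev/sdX1 /tmp/testbug.metadump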
Thank you.
2013/4/15 符永涛 <yongtaofu at gmail.com>
> Hi Eric,
> I'm sorry for spamming.
> I've got some more info that I hope you're interested in.
> In glusterfs 3.3, glusterfsd/src/glusterfsd.c line 1332 has an unlink operation:
> if (ctx->cmd_args.pid_file) {
>         unlink (ctx->cmd_args.pid_file);
>         ctx->cmd_args.pid_file = NULL;
> }
> Glusterfs tries to unlink the rebalance pid file after the rebalance
> completes, and maybe this is where the issue happens.
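> A hedged way to double-check that this unlink is what triggers the
> shutdown would be to trace unlink calls from the rebalance glusterfsd
> while it finishes, for example:
>
> # attach to the running rebalance glusterfsd and show unlink()/unlinkat()
> # calls; <rebalance-glusterfsd-pid> is a placeholder for the actual pid
> strace -f -e trace=unlink,unlinkat -p <rebalance-glusterfsd-pid>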
> See logs below:
> 1.
> /var/log/secure indicates I started the rebalance on Apr 15 at 11:58:11:
> Apr 15 11:58:11 10 sudo: root : TTY=pts/2 ; PWD=/root ; USER=root ; COMMAND=/usr/sbin/gluster volume rebalance testbug start
> 2.
> After the xfs shutdown I got the following log:
>
> --- xfs_iunlink_remove -- module("xfs").function("xfs_iunlink_remove@fs/xfs/xfs_inode.c:1680").return
> -- return=0x16
> vars: tp=0xffff881c81797c70 ip=0xffff881003c13c00 next_ino=? mp=? agi=? dip=?
> agibp=0xffff880109b47e20 ibp=? agno=? agino=? next_agino=? last_ibp=?
> last_dip=0xffff882000000000 bucket_index=? offset=? last_offset=0xffffffffffff8810
> error=? __func__=[...]
> ip: i_ino = 0x113, i_flags = 0x0
> The inode that led to the xfs shutdown is 0x113 (275 decimal, which matches
> the inode found in lost+found below), and return=0x16 is 22 decimal, i.e. the
> EINVAL in "returned error 22".
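> For reference, a minimal SystemTap one-liner for this kind of probe (the
> probe point is the one from the trace above; the output format here is just
> a simplified sketch, not the exact script):
>
> # report non-zero returns from xfs_iunlink_remove together with the caller
> stap -e 'probe module("xfs").function("xfs_iunlink_remove").return {
>     if ($return != 0)
>         printf("xfs_iunlink_remove returned %d (pid %d, %s)\n",
>                $return, pid(), execname())
> }'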
> 3.
> I repaired the xfs filesystem and found the inode in lost+found:
> [root@10.23.72.93 lost+found]# pwd
> /mnt/xfsd/lost+found
> [root@10.23.72.93 lost+found]# ls -l 275
> ---------T 1 root root 0 Apr 15 11:58 275
> [root@10.23.72.93 lost+found]# stat 275
>   File: `275'
>   Size: 0          Blocks: 0          IO Block: 4096   regular empty file
> Device: 810h/2064d Inode: 275         Links: 1
> Access: (1000/---------T)  Uid: (    0/    root)   Gid: (    0/    root)
> Access: 2013-04-15 11:58:25.833443445 +0800
> Modify: 2013-04-15 11:58:25.912461256 +0800
> Change: 2013-04-15 11:58:25.915442091 +0800
> This file was created around 2013-04-15 11:58, which matches when the
> rebalance started.
> The other files in lost+found have extended attributes but this file
> doesn't, which means it is not one of the glusterfs backend files. It
> should be the rebalance pid file.
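> The xattr difference can be checked with getfattr; glusterfs backend files
> normally carry trusted.* xattrs such as trusted.gfid, while a plain pid
> file does not:
>
> # dump all extended attributes of the orphaned inode, values in hex
> getfattr -d -m . -e hex /mnt/xfsd/lost+found/275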
>
> So maybe unlinking the rebalance pid file is what leads to the xfs shutdown.
>
> Thank you.
>
>
>
> 2013/4/15 Eric Sandeen <sandeen at sandeen.net>
>
>> On 4/15/13 8:45 AM, 符永涛 wrote:
>> > And at the same time we got the following error log of glusterfs:
>> > [2013-04-15 20:43:03.851163] I [dht-rebalance.c:1611:gf_defrag_status_get] 0-glusterfs: Rebalance is completed
>> > [2013-04-15 20:43:03.851248] I [dht-rebalance.c:1614:gf_defrag_status_get] 0-glusterfs: Files migrated: 1629, size: 1582329065954, lookups: 11036, failures: 561
>> > [2013-04-15 20:43:03.887634] W [glusterfsd.c:831:cleanup_and_exit] (-->/lib64/libc.so.6(clone+0x6d) [0x3bd16e767d] (-->/lib64/libpthread.so.0() [0x3bd1a07851] (-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xdd) [0x405c9d]))) 0-: received signum (15), shutting down
>> > [2013-04-15 20:43:03.887878] E [rpcsvc.c:1155:rpcsvc_program_unregister_portmap] 0-rpc-service: Could not unregister with portmap
>> >
>>
>> We'll take a look, thanks.
>>
>> Going forward, could I ask that you take a few minutes to batch up the
>> information, rather than sending several emails in a row? It makes it much
>> harder to collect the information when it's spread across so many emails.
>>
>> Thanks,
>> -Eric
>>
>>
>
>
> --
> 符永涛
>
--
符永涛