

To: Ben Myers <bpm@xxxxxxx>
Subject: Re: need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
From: 符永涛 <yongtaofu@xxxxxxxxx>
Date: Tue, 9 Apr 2013 23:07:47 +0800
Cc: xfs@xxxxxxxxxxx
Before the XFS forced shutdown happens, there seems to be no useful log in /var/log/messages:

Apr  9 10:38:08 cqdx smbd[4597]:   Unable to connect to CUPS server localhost:631 - Connection refused
Apr  9 10:38:08 cqdx smbd[3394]: [2013/04/09 10:38:08.944125,  0] printing/print_cups.c:468(cups_async_callback)
Apr  9 10:38:08 cqdx smbd[3394]:   failed to retrieve printer list: NT_STATUS_UNSUCCESSFUL
Apr  9 10:51:09 cqdx smbd[5205]: [2013/04/09 10:51:09.723610,  0] printing/print_cups.c:109(cups_connect)
Apr  9 10:51:09 cqdx smbd[5205]:   Unable to connect to CUPS server localhost:631 - Connection refused
Apr  9 10:51:09 cqdx smbd[3394]: [2013/04/09 10:51:09.724132,  0] printing/print_cups.c:468(cups_async_callback)
Apr  9 10:51:09 cqdx smbd[3394]:   failed to retrieve printer list: NT_STATUS_UNSUCCESSFUL
Apr  9 11:01:30 cqdx kernel: XFS (sdb): xfs_iunlink_remove: xfs_inotobp() returned error 22.
Apr  9 11:01:30 cqdx kernel: XFS (sdb): xfs_inactive: xfs_ifree returned error 22
Apr  9 11:01:30 cqdx kernel: XFS (sdb): xfs_do_force_shutdown(0x1) called from line 1184 of file fs/xfs/xfs_vnodeops.c.  Return address = 0xffffffffa02ee20a
Apr  9 11:01:30 cqdx kernel: XFS (sdb): I/O Error Detected. Shutting down filesystem
Apr  9 11:01:30 cqdx kernel: XFS (sdb): Please umount the filesystem and rectify the problem(s)
Apr  9 11:01:51 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
Apr  9 11:02:21 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
Apr  9 11:02:51 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
Apr  9 11:03:21 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
Apr  9 11:03:51 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
Apr  9 11:03:57 cqdx init: tty (/dev/tty1) main process (3427) killed by TERM signal
Apr  9 11:03:57 cqdx init: tty (/dev/tty2) main process (3429) killed by TERM signal



2013/4/9 Ben Myers <bpm@xxxxxxx>
Hey Yongtaofu,

On Tue, Apr 09, 2013 at 09:05:32PM +0800, 符永涛 wrote:
> Also, I want to know why all the servers crash with the same crash stack.
> Thank you, I really need your help.

What you've posted so far looks like evidence of a forced shutdown and not a
crash.  Is there a crash in addition to this forced shutdown?  If so, can you
post the stack for that too?

>
> 2013/4/9, 符永涛 <yongtaofu@xxxxxxxxx>:
> > BTW
> > xfs_info /dev/sdb
> > meta-data=""               isize=256    agcount=28, agsize=268435440 blks
> >          =                       sectsz=512   attr=2
> > data     =                       bsize=4096   blocks=7324303360, imaxpct=5
> >          =                       sunit=16     swidth=160 blks
> > naming   =version 2              bsize=4096   ascii-ci=0
> > log      =internal               bsize=4096   blocks=521728, version=2
> >          =                       sectsz=512   sunit=16 blks, lazy-count=1
> > realtime =none                   extsz=4096   blocks=0, rtextents=0
> >
> > 2013/4/9, 符永涛 <yongtaofu@xxxxxxxxx>:
> >> Dear xfs experts,
> >> I sincerely need your help. In our production environment we
> >> run glusterfs on top of xfs on Dell x720D (RAID 6), and the xfs file
> >> system crashes frequently on some of the servers, about every two weeks.
> >> Can you help point me in a direction for how to debug this issue and
> >> how to avoid it? Thank you very much!
> >>
> >> uname -a
> >> Linux cqdx.miaoyan.cluster1.node11.qiyi.domain 2.6.32-279.el6.x86_64
> >> #1 SMP Wed Jun 13 18:24:36 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux
> >>
> >> Every time the crash log is the same, as follows:

An initial guess is that somehow it is looking up a bad inode number, e.g. it
is beyond the end of the filesystem and xfs_dilocate returns EINVAL.

You could 'xfs_repair -n' to see what it finds (without modifying the
filesystem) as a first step.
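
To make the "bad inode number" guess a bit more concrete, here is a rough sketch of how an XFS inode number splits into allocation group, block within the group, and inode within the block, using the geometry from the xfs_info output quoted above (agcount=28, agsize=268435440, bsize=4096, isize=256). The function names are hypothetical and the bounds check only approximates the kind of test that makes the kernel return EINVAL; it is an illustration, not the kernel code.

```python
# Geometry copied from the xfs_info output in this thread.
AGCOUNT = 28
AGSIZE = 268435440        # blocks per allocation group
BLOCK_SIZE = 4096         # bytes per filesystem block
INODE_SIZE = 256          # bytes per on-disk inode


def log2_ceil(n):
    """Smallest k such that 2**k >= n."""
    return (n - 1).bit_length()


# Number of bits used for "inode within block" and "block within AG".
INOPBLOG = log2_ceil(BLOCK_SIZE // INODE_SIZE)
AGBLKLOG = log2_ceil(AGSIZE)


def decode_ino(ino):
    """Split an inode number into (agno, agbno, offset). Hypothetical helper."""
    offset = ino & ((1 << INOPBLOG) - 1)
    agbno = (ino >> INOPBLOG) & ((1 << AGBLKLOG) - 1)
    agno = ino >> (INOPBLOG + AGBLKLOG)
    return agno, agbno, offset


def ino_in_bounds(ino):
    """Approximate check: does the inode number point inside the filesystem?"""
    agno, agbno, _ = decode_ino(ino)
    return agno < AGCOUNT and agbno < AGSIZE
```

If a suspicious inode number turns up in the logs or in xfs_repair output, passing it to ino_in_bounds() gives a quick hint whether it could even exist on a filesystem with this geometry.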

> >> Apr  9 09:41:36 cqdx kernel: XFS (sdb): xfs_iunlink_remove: xfs_inotobp() returned error 22.

Were there any lines of output before this?  In some codebases there are prints
in xfs_inotobp that would help show what happened.
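
One way to check for earlier output is to pull the lines that immediately precede the first XFS error in the syslog file. The following is only a sketch: the regular expression is an assumption based on the message format shown in this thread, and lines_before_shutdown is a hypothetical helper name.

```python
import re
from collections import deque

# Matches the first signs of trouble as they appear in this thread's logs
# (assumed format: "kernel: XFS (<dev>): ... returned error N").
XFS_ERROR_RE = re.compile(
    r"kernel: XFS \(\w+\): .*(returned error|xfs_do_force_shutdown)"
)


def lines_before_shutdown(lines, context=20):
    """Return up to `context` lines preceding the first XFS error line,
    followed by that line itself. Returns [] if no error line is found."""
    recent = deque(maxlen=context)
    for line in lines:
        if XFS_ERROR_RE.search(line):
            return list(recent) + [line]
        recent.append(line)
    return []


# Example use (path as mentioned in this thread):
#   with open("/var/log/messages", errors="replace") as f:
#       for line in lines_before_shutdown(f, context=50):
#           print(line, end="")
```

Anything from the kernel in that window, even if it does not mention XFS, would be worth posting.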

> >> Apr  9 09:41:36 cqdx kernel: XFS (sdb): xfs_inactive: xfs_ifree returned error 22
> >> Apr  9 09:41:36 cqdx kernel: XFS (sdb): xfs_do_force_shutdown(0x1) called from line 1184 of file fs/xfs/xfs_vnodeops.c.  Return address = 0xffffffffa02ee20a
> >> Apr  9 09:41:36 cqdx kernel: XFS (sdb): I/O Error Detected. Shutting down filesystem
> >> Apr  9 09:41:36 cqdx kernel: XFS (sdb): Please umount the filesystem and rectify the problem(s)
> >> Apr  9 09:41:53 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
> >> Apr  9 09:42:23 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
> >> Apr  9 09:42:53 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
> >> Apr  9 09:43:23 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.

The error 5 (EIO) messages look scary, but they are due to the forced shutdown;
don't worry about them.
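
For reference, the numeric codes in these messages can be decoded with a few lines of Python. This sketch assumes the numbers printed by the kernel are the standard Linux errno values (22 for EINVAL, 5 for EIO), as read above; describe_errno is a hypothetical helper name.

```python
import errno
import os


def describe_errno(code):
    """Return a short description such as '22 = EINVAL: Invalid argument'."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{code} = {name}: {os.strerror(code)}"


# The two codes that appear in the log excerpts in this thread.
for code in (22, 5):
    print(describe_errno(code))
```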

Thanks,
        Ben



--
符永涛