need help how to debug xfs crash issue xfs_iunlink_remove: xfs_inotobp() returned error 22
符永涛
yongtaofu at gmail.com
Tue Apr 9 10:10:03 CDT 2013
Today 3 of our servers were impacted by the xfs shutdown. The logs are
identical.
2013/4/9 符永涛 <yongtaofu at gmail.com>
> Before the XFS forced shutdown there seems to be no useful log in
> /var/log/messages:
>
> Apr 9 10:38:08 cqdx smbd[4597]: Unable to connect to CUPS server
> localhost:631 - Connection refused
> Apr 9 10:38:08 cqdx smbd[3394]: [2013/04/09 10:38:08.944125, 0]
> printing/print_cups.c:468(cups_async_callback)
> Apr 9 10:38:08 cqdx smbd[3394]: failed to retrieve printer list:
> NT_STATUS_UNSUCCESSFUL
> Apr 9 10:51:09 cqdx smbd[5205]: [2013/04/09 10:51:09.723610, 0]
> printing/print_cups.c:109(cups_connect)
> Apr 9 10:51:09 cqdx smbd[5205]: Unable to connect to CUPS server
> localhost:631 - Connection refused
> Apr 9 10:51:09 cqdx smbd[3394]: [2013/04/09 10:51:09.724132, 0]
> printing/print_cups.c:468(cups_async_callback)
> Apr 9 10:51:09 cqdx smbd[3394]: failed to retrieve printer list:
> NT_STATUS_UNSUCCESSFUL
> Apr 9 11:01:30 cqdx kernel: XFS (sdb): xfs_iunlink_remove: xfs_inotobp()
> returned error 22.
> Apr 9 11:01:30 cqdx kernel: XFS (sdb): xfs_inactive: xfs_ifree returned
> error 22
> Apr 9 11:01:30 cqdx kernel: XFS (sdb): xfs_do_force_shutdown(0x1) called
> from line 1184 of file fs/xfs/xfs_vnodeops.c. Return address =
> 0xffffffffa02ee20a
> Apr 9 11:01:30 cqdx kernel: XFS (sdb): I/O Error Detected. Shutting down
> filesystem
> Apr 9 11:01:30 cqdx kernel: XFS (sdb): Please umount the filesystem and
> rectify the problem(s)
> Apr 9 11:01:51 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
> Apr 9 11:02:21 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
> Apr 9 11:02:51 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
> Apr 9 11:03:21 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
> Apr 9 11:03:51 cqdx kernel: XFS (sdb): xfs_log_force: error 5 returned.
> Apr 9 11:03:57 cqdx init: tty (/dev/tty1) main process (3427) killed by
> TERM signal
> Apr 9 11:03:57 cqdx init: tty (/dev/tty2) main process (3429) killed by
> TERM signal
>
>
>
> 2013/4/9 Ben Myers <bpm at sgi.com>
>
>> Hey Yongtaofu,
>>
>> On Tue, Apr 09, 2013 at 09:05:32PM +0800, 符永涛 wrote:
>> > Also, I want to know why all the servers crash with the same crash
>> > stack.
>> > Thank you, I really need your help.
>>
>> What you've posted so far looks like evidence of a forced shutdown and not a
>> crash. Is there a crash in addition to this forced shutdown? If so, can you
>> post the stack for that too?
>>
>> >
>> > 2013/4/9, 符永涛 <yongtaofu at gmail.com>:
>> > > BTW
>> > > xfs_info /dev/sdb
>> > > meta-data=/dev/sdb               isize=256    agcount=28, agsize=268435440 blks
>> > >          =                       sectsz=512   attr=2
>> > > data     =                       bsize=4096   blocks=7324303360, imaxpct=5
>> > >          =                       sunit=16     swidth=160 blks
>> > > naming   =version 2              bsize=4096   ascii-ci=0
>> > > log      =internal               bsize=4096   blocks=521728, version=2
>> > >          =                       sectsz=512   sunit=16 blks, lazy-count=1
>> > > realtime =none                   extsz=4096   blocks=0, rtextents=0
>> > >
>> > > 2013/4/9, 符永涛 <yongtaofu at gmail.com>:
>> > >> Dear xfs experts,
>> > >> I sincerely need your help! In our production environment we
>> > >> run glusterfs on top of xfs on Dell x720D (RAID 6). The xfs file
>> > >> system crashes on some of the servers frequently, about every two weeks.
>> > >> Can you give me some direction on how to debug this issue and
>> > >> how to avoid it? Thank you very much!
>> > >>
>> > >> uname -a
>> > >> Linux cqdx.miaoyan.cluster1.node11.qiyi.domain 2.6.32-279.el6.x86_64
>> > >> #1 SMP Wed Jun 13 18:24:36 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux
>> > >>
>> > >> Every time the crash log is the same, as follows:
>>
>> An initial guess is that somehow it is looking up a bad inode number, e.g. it
>> is beyond the end of the filesystem and xfs_dilocate returns EINVAL.
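To make the "beyond the end of the filesystem" guess concrete, here is a minimal sketch of how an XFS inode number splits into an allocation group number, a block within that group, and a slot within that block, using the geometry from the xfs_info output quoted in this thread. The function name decode_inode and the sample inode numbers are hypothetical, and the range check is only an assumption about what kind of lookup would be rejected with EINVAL; it is not taken from the kernel source for this release.

```python
def decode_inode(ino, agcount, agblocks, blocksize, inodesize):
    """Split an XFS inode number into (agno, agbno, offset) and range-check it.

    Illustrative only: a coarse check against the nominal AG size, which
    ignores that the last allocation group may be shorter.
    """
    # log2 of inodes per block: 4096 / 256 = 16 inodes, so 4 bits
    inopblog = (blocksize // inodesize).bit_length() - 1
    # log2 of blocks per AG, rounded up: 268435440 blocks needs 28 bits
    agblklog = (agblocks - 1).bit_length()
    agino_bits = inopblog + agblklog

    agno = ino >> agino_bits                    # allocation group number
    agino = ino & ((1 << agino_bits) - 1)       # inode number within the AG
    agbno = agino >> inopblog                   # block within the AG
    offset = agino & ((1 << inopblog) - 1)      # inode slot within the block

    in_range = agno < agcount and agbno < agblocks
    return {"agno": agno, "agbno": agbno, "offset": offset, "in_range": in_range}


# Geometry taken from the xfs_info output in this thread.
GEOMETRY = dict(agcount=28, agblocks=268435440, blocksize=4096, inodesize=256)

if __name__ == "__main__":
    # Hypothetical inode numbers, chosen only to exercise both outcomes.
    for ino in ((3 << 32) | (100 << 4) | 5, (28 << 32) | 16):
        print(ino, decode_inode(ino, **GEOMETRY))
```

With this geometry an inode number whose upper bits give an AG number of 28 or more cannot belong to the filesystem, which is the sort of value Ben's guess describes.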
>>
>> You could run 'xfs_repair -n' to see what it finds (without modifying the
>> filesystem) as a first step.
>>
>> > >> 1038 Apr 9 09:41:36 cqdx kernel: XFS (sdb): xfs_iunlink_remove:
>> > >> xfs_inotobp() returned error 22.
>>
>> Were there any lines of output before this? In some codebases there are prints
>> in xfs_inotobp that would help show what happened.
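One way to answer the question about earlier output is to pull the lines that precede the first XFS "returned error" message out of the syslog file. The sketch below is a hedged example: the helper name context_before_first_xfs_error is hypothetical, and the pattern assumes the message layout seen in the logs in this thread ("kernel: XFS (sdb): ...").

```python
import re
from collections import deque

# Matches kernel XFS messages as they appear in the logs in this thread.
XFS_LINE = re.compile(r"kernel: XFS \((?P<dev>[^)]+)\): (?P<msg>.*)")


def context_before_first_xfs_error(lines, before=5):
    """Return (preceding_lines, error_line) for the first XFS error message.

    Returns ([], None) if no XFS "returned error" line is found.
    """
    recent = deque(maxlen=before)  # rolling window of the last `before` lines
    for line in lines:
        line = line.rstrip("\n")
        match = XFS_LINE.search(line)
        if match and "returned error" in match.group("msg"):
            return list(recent), line
        recent.append(line)
    return [], None


if __name__ == "__main__":
    # /var/log/messages is the file named in this thread; reading it
    # usually needs root.
    with open("/var/log/messages", errors="replace") as log:
        context, hit = context_before_first_xfs_error(log, before=20)
    for line in context:
        print(line)
    print(hit)
```

Running the same script on each affected server makes it easy to compare what was logged just before the shutdown.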
>>
>> > >> 1039 Apr 9 09:41:36 cqdx kernel: XFS (sdb): xfs_inactive: xfs_ifree
>> > >> returned error 22
>> > >> 1040 Apr 9 09:41:36 cqdx kernel: XFS (sdb):
>> > >> xfs_do_force_shutdown(0x1) called from line 1184 of file
>> > >> fs/xfs/xfs_vnodeops.c. Return address = 0xffffffffa02ee20a
>> > >> 1041 Apr 9 09:41:36 cqdx kernel: XFS (sdb): I/O Error Detected.
>> > >> Shutting down filesystem
>> > >> 1042 Apr 9 09:41:36 cqdx kernel: XFS (sdb): Please umount the
>> > >> filesystem and rectify the problem(s)
>> > >> 1043 Apr 9 09:41:53 cqdx kernel: XFS (sdb): xfs_log_force: error 5
>> > >> returned.
>> > >> 1044 Apr 9 09:42:23 cqdx kernel: XFS (sdb): xfs_log_force: error 5
>> > >> returned.
>> > >> 1045 Apr 9 09:42:53 cqdx kernel: XFS (sdb): xfs_log_force: error 5
>> > >> returned.
>> > >> 1046 Apr 9 09:43:23 cqdx kernel: XFS (sdb): xfs_log_force: error 5
>> > >> returned.
>>
>> The error 5 (EIO) messages look scary, but they are due to the forced
>> shutdown; don't worry about them.
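For reference, the numeric codes in these messages can be mapped to their symbolic names with Python's standard errno module. This is a small sketch and the helper name describe_errno is hypothetical. On Linux, 22 is EINVAL and 5 is EIO, matching what Ben describes.

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, human-readable text) for a numeric errno."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


if __name__ == "__main__":
    # The two codes that appear in the kernel messages in this thread.
    for code in (22, 5):
        print(code, *describe_errno(code))
```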
>>
>> Thanks,
>> Ben
>>
>
>
>
> --
> 符永涛
>
--
符永涛