CentOS 5.5 XFS internal errors (XFS_WANT_CORRUPTED_GOTO)
Shaun Adolphson
shaun at adolphson.net
Sun Jul 11 06:44:07 CDT 2010
On Thu, Jul 8, 2010 at 9:21 PM, Shaun Adolphson <shaun at adolphson.net> wrote:
> On Wed, Jul 7, 2010 at 9:18 AM, Dave Chinner <david at fromorbit.com> wrote:
>>
>> On Tue, Jul 06, 2010 at 08:57:45PM +1000, Shaun Adolphson wrote:
>> > Hi,
>> >
>> > We have been able to reproducibly trigger XFS internal errors
>> > (XFS_WANT_CORRUPTED_GOTO) on one of our fileservers. We are attempting
>> > to locally copy a 248Gig file off a USB drive formatted as NTFS to the
>> > XFS drive. The copy gets about 96% of the way through and we get the
>> > following messages:
>> >
>> > Jun 28 22:14:46 terrorserver kernel: XFS internal error
>> > XFS_WANT_CORRUPTED_GOTO at line 2092 of file fs/xfs/xfs_bmap_btree.c.
>> > Caller 0xffffffff8837446f
>>
>> Interesting. That's a corrupted inode extent btree - I haven't seen
>> one of them for a long while. Were there any errors (like IO errors)
>> reported before this?
>>
>> However, the first step is to determine if the error is on disk or an
>> in-memory error. Can you post output of:
>>
>> - xfs_info <mntpt>
meta-data=/dev/TerrorVolume/terror isize=256    agcount=130385, agsize=32768 blks
         =                         sectsz=512   attr=1
data     =                         bsize=4096   blocks=4272433152, imaxpct=25
         =                         sunit=0      swidth=0 blks
naming   =version 2                bsize=4096   ascii-ci=0
log      =internal                 bsize=4096   blocks=2560, version=1
         =                         sectsz=512   sunit=0 blks, lazy-count=0
realtime =none                     extsz=4096   blocks=0, rtextents=0
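
For reference, here is a minimal sketch of the arithmetic behind the geometry above. It only restates the numbers xfs_info printed (bsize, blocks, agsize); the function name and the dictionary keys are made up for illustration. It works out to 128 MiB per allocation group and roughly 15.9 TiB of data blocks.

```python
# Values copied from the xfs_info output above.
BLOCK_SIZE = 4096          # data bsize, in bytes
DATA_BLOCKS = 4272433152   # data blocks
AG_BLOCKS = 32768          # agsize, in filesystem blocks


def geometry_summary(block_size, data_blocks, ag_blocks):
    """Derive sizes in bytes and the implied AG count from xfs_info fields."""
    ag_bytes = ag_blocks * block_size
    fs_bytes = data_blocks * block_size
    # Ceiling division: the last allocation group may be shorter than the rest.
    ag_count = -(-data_blocks // ag_blocks)
    last_ag_blocks = data_blocks - (ag_count - 1) * ag_blocks
    return {
        "ag_bytes": ag_bytes,
        "fs_bytes": fs_bytes,
        "ag_count": ag_count,
        "last_ag_blocks": last_ag_blocks,
    }


if __name__ == "__main__":
    summary = geometry_summary(BLOCK_SIZE, DATA_BLOCKS, AG_BLOCKS)
    print("AG size        : %d MiB" % (summary["ag_bytes"] // 2**20))
    print("Filesystem size: %.1f TiB" % (summary["fs_bytes"] / 2**40))
    print("Implied agcount: %d" % summary["ag_count"])
    print("Last AG blocks : %d" % summary["last_ag_blocks"])
```

The implied AG count matches the agcount=130385 that xfs_info reports.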
>> - xfs_repair -n after a shutdown
The output of xfs_repair -n is 6 MB; below is the condensed
version. I can post the whole output if required.
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        .
        .
        .
        - agno = 130384
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify link counts...
No modify flag set, skipping filesystem flush and exiting.
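
In case it helps anyone else with a multi-megabyte xfs_repair log, here is a rough sketch of one way to condense it. It is not necessarily how the listing above was produced. It collapses each run of consecutive "- agno = N" progress lines to the first line, three dots, and the last line, and keeps every other line verbatim. The function name and regular expression are my own.

```python
import re

# Matches progress lines such as "        - agno = 42".
AGNO_RE = re.compile(r"^\s*- agno = \d+\s*$")


def condense(lines):
    """Collapse runs of consecutive agno progress lines; keep all else as-is."""
    out = []
    run = []

    def flush():
        # Short runs are kept whole; longer runs become first, dots, last.
        if len(run) <= 2:
            out.extend(run)
        else:
            out.extend([run[0], ".", ".", ".", run[-1]])
        del run[:]

    for line in lines:
        line = line.rstrip("\n")
        if AGNO_RE.match(line):
            run.append(line)
        else:
            flush()
            out.append(line)
    flush()
    return out


if __name__ == "__main__":
    import sys
    # Usage: xfs_repair -n <device> 2>&1 | python condense.py
    for row in condense(sys.stdin):
        print(row)
```

Since any line that is not a progress line ends the current run, messages printed between AG scans stay visible in the condensed output.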
>>
>> Can you upgrade xfsprogs (i.e. xfs_repair) to the latest version
>> (3.1.2) before you do this as well?
# xfs_repair -V
xfs_repair version 3.1.2
>
> We have upgraded xfsprogs to 3.1.2 and are in the process of
> collecting the required information.
>
>>
>> > We have reproduced the condition 3 times and each time we have been
>> > able to remount the drive (to replay the transaction log) and then
>> > perform an xfs_repair.
>> >
>> > We are just using cp to copy the file.
>> >
>> > Some further details about the system:
>> >
>> > Software:
>> > - Fresh install of CentOS 5.5 64-bit, all patches up to date
>> > - Kernel 2.6.18-194.3.1.el5.centos.plus
>>
>> I've got no idea exactly what version of XFS that has in it, so I
>> can't say off the top of my head whether this is a fixed bug or not.
>>
>> Cheers,
>>
>> Dave.
>> --
>> Dave Chinner
>> david at fromorbit.com
>
>
>
> During other testing we have also been able to reproduce the issue by
> copying a self-generated 248Gig file from another system disk to the
> XFS disk. The file was generated using dd with /dev/zero as input.
>
> All the existing data (~6TB) was successfully copied onto the storage
> without hitting the error. The thing to note is that all the existing
> files are much smaller than the one that we are trying to copy in
> (248Gig). Since the shutdowns started, we have also copied many
> smaller files (each < 30Gig in size) onto the storage area without
> issue.
>
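
For anyone wanting to try a similar reproduction without the original USB drive, below is a minimal sketch of generating a large zero-filled file. It does roughly what dd with /dev/zero as input does, but it is not the exact command we ran. The function name, chunk size, and target path are placeholders, and "248Gig" is read here as 248 * 2**30 bytes, which is an assumption.

```python
import os


def write_zero_file(path, total_bytes, chunk_bytes=1024 * 1024):
    """Write total_bytes of zeros to path in fixed-size chunks."""
    chunk = bytes(chunk_bytes)  # a buffer of zero bytes
    written = 0
    with open(path, "wb") as f:
        while written < total_bytes:
            n = min(chunk_bytes, total_bytes - written)
            f.write(chunk[:n])
            written += n
        # Push the data to disk before returning.
        f.flush()
        os.fsync(f.fileno())
    return written


if __name__ == "__main__":
    # Hypothetical target path; adjust to a disk with enough free space.
    size = 248 * 2**30  # assumes "248Gig" means GiB
    write_zero_file("/tmp/zero-test.bin", size)
```

The resulting file can then be copied onto the XFS volume with cp, as in the failing case.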
Thanks,
Shaun