To: xfs@xxxxxxxxxxx
Subject: XFS_WANT_CORRUPTED_GOTO followed by "Attempt to access beyond end of device"
From: Jack Brown <zidibik@xxxxxxxxx>
Date: Mon, 21 Jul 2008 05:21:52 -0700 (PDT)

Dear list,

Below I have tried to give as many details as possible about an incident I
had with XFS. Executive summary: a happily working XFS partition failed, and
refuses to be repaired, after a series of setfacl executions.

Here's the full story:

I'm running a Gentoo machine with a 2.6.20-xen-r6 kernel and xfsprogs-2.9.7.
I have a ~300 GB XFS partition on software RAID1, mounted at /var, on a
server that is mainly used as a Samba file server and a mail server, and is
also home to three HVM guest OSes (Windows). The guest OSes live on a
separate hard disk. All three disks are SATA and are connected to:

Intel Corporation 631xESB/632xESB/3100 Chipset SATA IDE Controller (rev 09)

Anyway, I recently implemented ACLs for our Samba users. After some recursive
setfacl commands, my attempt to rm -rf a ~20 GB directory failed (Jul 20,
~01:00):

rm: FATAL: cannot ensure `<filename>' (returned to via ..) is safe: Input/output error

dmesg showed:
XFS internal error XFS_WANT_CORRUPTED_GOTO at line 4533 of file fs/xfs/xfs_bmap.c.
Filesystem "md5": XFS internal error xfs_trans_cancel at line 1138 of file fs/xfs/xfs_trans.c.
xfs_force_shutdown(md5,0x8) called from line 1139 of file fs/xfs/xfs_trans.c.  Return address = 0xee2a9a28
Filesystem "md5": Corruption of in-memory data detected.  Shutting down filesystem: md5
Please umount the filesystem, and rectify the problem(s)

When I then looked at /var/log/messages (which logs the dmesg output), I saw
that the first XFS-related error was:

Jul 19 19:46:52 canavar XFS internal error XFS_WANT_CORRUPTED_GOTO at line 4533 of file fs/xfs/xfs_bmap.c.  Caller 0xee27d8c6 < stack addresses removed >

That error probably first appeared while I was executing the setfacl
commands. After the rm failed, XFS stopped all operations. With /var
effectively read-only, I cold rebooted.

Upon reboot, I saw I/O errors complaining about an access far beyond the end
of the device:

Jul 20 02:23:38 canavar attempt to access beyond end of device
Jul 20 02:23:38 canavar md5: rw=0, want=405021982988992520, limit=603529728
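
For scale, here is a rough sketch of what those two numbers mean. It assumes
that want and limit in this kernel message are counted in 512-byte sectors;
that is an assumption about the message format, not something stated in the
log itself. The helper name parse_beyond_eod is made up for illustration.

```python
import re

# Assumption: want/limit in this kernel message are 512-byte sector counts.
SECTOR_BYTES = 512

line = "Jul 20 02:23:38 canavar md5: rw=0, want=405021982988992520, limit=603529728"


def parse_beyond_eod(msg):
    """Pull the device name, want and limit out of the detail line."""
    m = re.search(r"(\w+): rw=\d+, want=(\d+), limit=(\d+)", msg)
    if m is None:
        raise ValueError("not an 'access beyond end of device' detail line")
    return m.group(1), int(m.group(2)), int(m.group(3))


dev, want, limit = parse_beyond_eod(line)

# Size of the device implied by limit, under the sector-size assumption.
limit_gib = limit * SECTOR_BYTES / 2**30

# How far past the end of the device the request points.
ratio = want / limit

print(f"{dev}: device size is about {limit_gib:.1f} GiB")
print(f"{dev}: requested sector is about {ratio:.2e} times the device size")
```

Under that assumption, limit is in line with the ~300 GB partition, while
want points many orders of magnitude past it, which looks more like a
garbage block address than a genuinely mis-sized partition.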

The kernel refused to mount the filesystem. xfs_repair refused to run as
well, because the filesystem log had not yet been replayed. I then ran
xfs_repair -L, which started repairing things but hung in the middle of
phase 3. I Ctrl+C'd my way out of it and was at least able to mount the
partition. Most of the data seemed undamaged, but the same thing happens when
I try to rm -rf that directory (it's officially haunted :)). After copying
some important stuff out, I retried xfs_repair; this time it hung around
phase 4. (The repair log for this run is attached.)

# ps uax | grep xfs_repair
root      2022  0.4 57.1 653488 599572 pts/0   Sl   12:32   0:03 xfs_repair -L /dev/md5 -v
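
A quick reading of that ps line, as a sketch only: it assumes the usual
"ps u" column order (USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND)
with VSZ and RSS in KiB. The total-memory figure is only an estimate derived
from %MEM, not a measured value.

```python
ps_line = ("root      2022  0.4 57.1 653488 599572 pts/0   Sl   12:32   0:03 "
           "xfs_repair -L /dev/md5 -v")

# Split into the ten fixed columns plus the command, which may contain spaces.
fields = ps_line.split(None, 10)
user, pid, cpu_pct, mem_pct, vsz_kib, rss_kib = fields[:6]
command = fields[10]

# Resident memory of xfs_repair, assuming RSS is reported in KiB.
rss_mib = int(rss_kib) / 1024

# Rough estimate of the memory visible to this system, inferred from %MEM.
total_mib = rss_mib / (float(mem_pct) / 100)

print(f"{command!r} is holding about {rss_mib:.0f} MiB resident")
print(f"estimated total memory: about {total_mib:.0f} MiB")
```

If those assumptions hold, xfs_repair alone is using more than half of the
memory available to this system while it appears to hang.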

I took an xfs_metadump at some point in between, while xfs_repair was hanging
in phase 3. It's a bzipped 67 MB file. If somebody can provide a place to
upload it, I can send it over.

If anyone is interested in further data, I can provide it; I'm keeping the
crippled filesystem in case it may be of use to you.


best regards,
jack


      

Attachment: repairlog.scrubbed
Description: Binary data
