On Tue, Apr 02, 2013 at 11:44:15AM -0700, L Ox wrote:
> We have a new Linux/XFS deployment (about a month old) and randomly without
> warning the XFS filesystem will go off-line. We are running Scientific
> Linux release 5.9 with the latest updates.
> # uname -a
> Linux node24 2.6.18-348.3.1.el5 #1 SMP Mon Mar 11 15:43:13 EDT 2013 x86_64
> x86_64 x86_64 GNU/Linux
> # cat /etc/redhat-release
> Scientific Linux release 5.9 (Boron)
> Here are the errors we see in /var/log/messages after the initial off-line
> -- snip --
> Apr 2 07:50:28 node24 kernel: xfs_iunlink_remove: xfs_inotobp() returned
> an error 22 on dm-6. Returning error.
> Apr 2 07:50:28 node24 kernel: xfs_inactive: xfs_ifree() returned an error
> = 22 on dm-6
#define EINVAL 22 /* Invalid argument */
That tends to imply a corrupt inode number in the unlinked list.
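
As a quick cross-check of the number-to-name mapping, here is a small
lookup using Python's standard errno module. It is only a sketch for
decoding the value in the log, and the variable names are illustrative.

```python
import errno
import os

code = 22  # value printed in the kernel messages quoted above

name = errno.errorcode[code]   # symbolic name for this errno value
message = os.strerror(code)    # the platform's description string

print(f"{code} = {name}: {message}")
```

On Linux this maps 22 to EINVAL, matching the #define above.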
> Here are the messages after I umount/xfs_repair/mount the filesystem:
What did xfs_repair detect/fix?
> # xfs_repair /dev/mapper/vol_d24-root
> Phase 1 - find and verify superblock...
> Phase 6 - check inode connectivity...
> - resetting contents of realtime bitmap and summary inodes
> - traversing filesystem ...
> - traversal finished ...
> - moving disconnected inodes to lost+found ...
> disconnected inode 202102936036, moving to lost+found
> disconnected inode 215350040250, moving to lost+found
> disconnected inode 215350208634, moving to lost+found
> disconnected inode 271016406074, moving to lost+found
Those are inodes that had been unlinked from the directory structure
but not freed. They were probably on an unlinked inode list that
couldn't be walked.
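
To illustrate why one bad inode number can strand other inodes, here
is a toy model. It assumes the unlinked list behaves like a singly
linked chain of inode numbers in which removing an entry means walking
from the head to find its predecessor. The function and variable names
are hypothetical, and this is not the kernel code.

```python
END = 0xFFFFFFFF  # end-of-list marker used by this toy model


class CorruptListError(Exception):
    """Raised when an inode number on the chain cannot be looked up."""


def find_predecessor(head, next_unlinked, target):
    """Return the inode that points at `target`, or None if it is the head.

    `next_unlinked` maps an inode number to the next inode number
    on the chain.
    """
    prev = None
    cur = head
    seen = set()
    while cur != END:
        if cur == target:
            return prev
        if cur not in next_unlinked or cur in seen:
            # Analogous to a lookup failing with EINVAL: the walk stops here.
            raise CorruptListError(f"cannot look up inode {cur}")
        seen.add(cur)
        prev, cur = cur, next_unlinked[cur]
    raise LookupError(f"inode {target} is not on the list")


# Healthy chain: 10 -> 20 -> 30 -> END
healthy = {10: 20, 20: 30, 30: END}

# Same chain, but the pointer stored in inode 10 is garbage.
corrupt = {10: 999999, 20: 30, 30: END}

try:
    find_predecessor(10, corrupt, 30)
except CorruptListError as err:
    print("walk failed:", err)
```

In the corrupt case, inodes 20 and 30 are still allocated but can no
longer be reached from the head, so they can never be removed from the
list and freed. That matches what xfs_repair found: inodes with no
directory entry that had not been freed.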
> Any ideas?
If the problem is a one-off, there isn't anything that can be done.
If you can reproduce it, try to narrow it down to the simplest case