On Sat, May 03, 2014 at 12:49:33PM -0700, Marcel Giannelia wrote:
> I just had the following happen on a server that's never had any
> previous issues with XFS:
> I was copying an 8 GB file onto an XFS filesystem, when the copy
> aborted with an I/O error message and I was forced to unmount and
> xfs_repair the filesystem before it would mount again. Relevant dmesg
> messages below.
> Some other information that might be relevant:
> - Distribution & kernel version: Debian 7, uname -a returns:
> Linux hostname 3.2.0-4-686-pae #1 SMP Debian 3.2.41-2+deb7u2 i686 GNU/Linux
So, old hardware...
> dmesg entries:
> > Immediately after the cp command exited with "i/o error":
> XFS (md126): xfs_iflush_int: Bad inode 939480132, ptr 0xd12fa080, magic
> number 0x494d
The magic number is off by one (0x494d instead of 0x494e, so the low
two bits differ), which looks like a bit error:
#define XFS_DINODE_MAGIC 0x494e /* 'IN' */
That's the in-memory inode, not the on-disk inode. It caught the
problem before writing the bad magic number to disk - the in-memory
disk buffer was checked immediately before the in-memory copy, and
it checked out OK...
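
For reference, a quick way to see how far the reported magic is from
the expected one is to XOR the two values. The following is only a
minimal sketch in Python: the helper names are made up for
illustration, and only the two constants come from this thread (the
#define and the dmesg line). It shows that the values differ by the
mask 0x0003, i.e. the low two bits, and that the bad value decodes as
'IM' rather than 'IN'.

```python
# Illustrative sketch only: magic_diff() and magic_ascii() are hypothetical
# helper names, not part of XFS or xfsprogs.

XFS_DINODE_MAGIC = 0x494E  # 'IN', from the #define quoted in this mail
REPORTED_MAGIC = 0x494D    # value printed by xfs_iflush_int in dmesg


def magic_diff(expected, actual):
    """Return (xor mask, number of differing bits) for two 16-bit magics."""
    mask = (expected ^ actual) & 0xFFFF
    return mask, bin(mask).count("1")


def magic_ascii(value):
    """Decode a 16-bit magic number as two ASCII characters, high byte first."""
    return value.to_bytes(2, "big").decode("ascii")


if __name__ == "__main__":
    mask, nbits = magic_diff(XFS_DINODE_MAGIC, REPORTED_MAGIC)
    print("expected %#06x (%r)" % (XFS_DINODE_MAGIC, magic_ascii(XFS_DINODE_MAGIC)))
    print("reported %#06x (%r)" % (REPORTED_MAGIC, magic_ascii(REPORTED_MAGIC)))
    print("xor mask %#06x, %d differing bit(s)" % (mask, nbits))
```

A corruption confined to the low bits of one byte like this is the
kind of pattern that points towards memory or other hardware trouble
rather than a filesystem bug.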
> After this, I ran xfs_repair with -L. xfs_repair noted the same bad inode
> number and deleted the file I had tried to copy, but otherwise made no changes
> that I could see. After this, the filesystem mounted normally and there were
> no further issues.
What was the error that xfs_repair returned? There may have been
other things wrong with the inode that weren't caught when it was
loaded into memory.
However, I'd almost certainly be checking your hardware at this
point, as software doesn't usually cause random single bit flips...