xfs_trans_read_buf error / xfs_force_shutdown with LVM snapshot and Xen kernel 2.6.18
Wolfram Schlich
lists at wolfram.schlich.org
Thu Jun 18 10:03:57 CDT 2009
* Eric Sandeen <sandeen at sandeen.net> [2009-06-18 16:09]:
> Wolfram Schlich wrote:
> > Hi!
> >
> > I'm currently using LVM snapshots to create full system backups
> > of a bunch of Xen-based virtual machines (so-called domUs).
> > Those domUs all run Xen kernel 2.6.18 from the Xen 3.2.0 release
> > (32bit domU on 32bit dom0, I can post the .config if needed).
> > All domUs are using XFS on their LVM logical volumes.
> > The backup of all mounted snapshot volumes is made using
> > rsnapshot/rsync. This has been running smoothly for some
> > weeks now on 5 domUs.
> >
> > Yesterday this happened during the backup on 1 domU:
> > --8<--
> > kernel: I/O error in filesystem ("dm-21") meta-data dev dm-21 block 0x604d68 ("xfs_trans_read_buf") error 5 buf count 4096
> [...]
> > [...many more of such messages...]
>
> Well these are all I/O errors happening -to- xfs, so xfs is unlikely to
> be at fault here. Any block layer messages before that?
Unfortunately not a single one :(
> > Is it possible that the LVM snapshot (that should be using
> > xfs_freeze/xfs_unfreeze) has created an inconsistent/damaged
> > snapshot that was kept from being repaired through norecovery?
> > Any other ideas?
>
> If it was a proper snapshot norecovery shouldn't matter, as the fs
> should be clean already (well, hopefully, 2.6.18 was a long time ago;
> this is true today, anyway)
Ok.
> I suppose it's possible that the snapshot was not consistent, and you're
> hitting problems there, but things like:
>
> > kernel: I/O error in filesystem ("dm-21") meta-data dev dm-21 block 0xdd0 ("xfs_trans_read_buf") error 5 buf count 8192
>
> look like a failure to read a perfectly normal block, not out of bounds
> or anything, so I'd most likely point to problems outside xfs.
I've now traced it back to LVM. It seems that the LVM snapshot
volume we were backing up at that time ran out of space and was
therefore automatically removed (so the block device holding the
XFS filesystem vanished).
Stupid LVM does not log ANYTHING when it just deletes a snapshot
running out of space :( I've now activated dmeventd which *does*
log such events *sigh*
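
For anyone hitting the same problem, here is a minimal sketch (not part of
the original setup) of a check a backup script could run before and during
the rsync to notice a snapshot that is about to fill up. It assumes the
`lvs` tool is available and reports snapshot usage in its `snap_percent`
field; the function names `parse_lvs` and `read_lvs` and the 80% threshold
are made up for illustration.

```python
import os
import subprocess


def parse_lvs(output, threshold=80.0):
    """Return (vg, lv, percent) for snapshots at or above `threshold` percent full.

    `output` is expected to hold lines of the form
    vg_name|lv_name|origin|snap_percent (hypothetical layout chosen below).
    """
    alerts = []
    for line in output.splitlines():
        fields = [f.strip() for f in line.split("|")]
        if len(fields) != 4:
            continue  # blank or unexpected line
        vg, lv, origin, percent = fields
        if not origin or not percent:
            continue  # not a snapshot, or no usage figure reported
        try:
            used = float(percent)
        except ValueError:
            continue  # usage figure not numeric
        if used >= threshold:
            alerts.append((vg, lv, used))
    return alerts


def read_lvs():
    """Run lvs and return its raw output (usually needs root privileges)."""
    cmd = ["lvs", "--noheadings", "--separator", "|",
           "-o", "vg_name,lv_name,origin,snap_percent"]
    env = dict(os.environ, LC_ALL="C")  # keep the number format predictable
    result = subprocess.run(cmd, check=True, capture_output=True,
                            text=True, env=env)
    return result.stdout


if __name__ == "__main__":
    for vg, lv, used in parse_lvs(read_lvs()):
        print(f"WARNING: snapshot {vg}/{lv} is {used:.1f}% full")
```

Run from cron or from the backup wrapper, something like this would at
least leave a trace in the logs before the snapshot disappears; it
complements dmeventd rather than replacing it.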
Thanks!
--
Regards,
Wolfram Schlich <wschlich at gentoo.org>
Gentoo Linux * http://dev.gentoo.org/~wschlich/