
Re: xfs_trans_read_buf error / xfs_force_shutdown with LVM snapshot and Xen kernel 2.6.18

To: xfs@xxxxxxxxxxx
Subject: Re: xfs_trans_read_buf error / xfs_force_shutdown with LVM snapshot and Xen kernel 2.6.18
From: Wolfram Schlich <lists@xxxxxxxxxxxxxxxxxxx>
Date: Thu, 18 Jun 2009 17:03:57 +0200
In-reply-to: <4A3A47AC.6070406@xxxxxxxxxxx>
Organization: Axis of Weasel(s)
References: <20090618065621.GD16867@xxxxxxxxxxxxx> <4A3A47AC.6070406@xxxxxxxxxxx>
User-agent: Mutt/1.5.20 (2009-06-14)
* Eric Sandeen <sandeen@xxxxxxxxxxx> [2009-06-18 16:09]:
> Wolfram Schlich wrote:
> > Hi!
> > 
> > I'm currently using LVM snapshots to create full system backups
> > of a bunch of Xen-based virtual machines (so-called domUs).
> > Those domUs all run Xen kernel 2.6.18 from the Xen 3.2.0 release
> > (32bit domU on 32bit dom0, I can post the .config if needed).
> > All domUs are using XFS on their LVM logical volumes.
> > The backup of all mounted snapshot volumes is made using
> > rsnapshot/rsync. This has been running smoothly for some
> > weeks now on 5 domUs.
> > 
> > Yesterday this happened during the backup on 1 domU:
> > --8<--
> > kernel: I/O error in filesystem ("dm-21") meta-data dev dm-21 block 0x604d68 ("xfs_trans_read_buf") error 5 buf count 4096
> [...]
> > [...many more of such messages...]
> Well these are all I/O errors happening -to- xfs, so xfs is unlikely to
> be at fault here.  Any block layer messages before that?

Unfortunately not a single one :(

> > Is it possible that the LVM snapshot (that should be using
> > xfs_freeze/xfs_unfreeze) has created an inconsistent/damaged
> > snapshot that was kept from being repaired through norecovery?
> > Any other ideas?
> If it was a proper snapshot norecovery shouldn't matter, as the fs
> should be clean already (well, hopefully, 2.6.18 was a long time ago;
> this is true today, anyway)


> I suppose it's possible that the snapshot was not consistent, and you're
> hitting problems there, but things like:
> > kernel: I/O error in filesystem ("dm-21") meta-data dev dm-21 block 0xdd0 ("xfs_trans_read_buf") error 5 buf count 8192
> looks like a failure to read a perfectly normal block, not out of bounds
> or anything, so I'd most likely point to problems outside xfs.
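
For anyone reading these messages later: the fields can be pulled apart mechanically. The following is only a rough sketch in Python; the pattern is derived solely from the two log lines quoted in this thread (other kernel versions may format the message differently), and parse_xfs_io_error is a made-up helper name. Error 5 is EIO, i.e. a plain I/O error reported by the layer below XFS.

```python
import errno
import re

# Pattern built only from the messages quoted in this thread (an assumption:
# other kernels may word the message differently).
MSG_RE = re.compile(
    r'I/O error in filesystem \("(?P<fs>[^"]+)"\) meta-data dev (?P<dev>\S+) '
    r'block\s+(?P<block>0x[0-9a-fA-F]+)\s+\("(?P<func>\w+)"\) '
    r'error (?P<err>\d+) buf count (?P<count>\d+)'
)

def parse_xfs_io_error(line):
    """Hypothetical helper: split one XFS I/O error message into its fields."""
    m = MSG_RE.search(line)
    if m is None:
        return None
    err = int(m.group("err"))
    return {
        "dev": m.group("dev"),
        "block": int(m.group("block"), 16),   # block number as an integer
        "func": m.group("func"),
        "errno": err,
        "errno_name": errno.errorcode.get(err, "unknown"),
        "buf_count": int(m.group("count")),
    }

sample = ('kernel: I/O error in filesystem ("dm-21") meta-data dev dm-21 block '
          '0xdd0       ("xfs_trans_read_buf") error 5 buf count 8192')
info = parse_xfs_io_error(sample)
print(info["dev"], hex(info["block"]), info["errno_name"])
```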

I've now traced it back to LVM. It seems that the LVM snapshot
volume we were backing up at that time ran out of space and was
therefore automatically removed, so the block device holding the
XFS filesystem vanished.

Stupid LVM does not log ANYTHING when it just deletes a snapshot
running out of space :( I've now activated dmeventd which *does*
log such events *sigh*
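
In case it helps others: besides dmeventd, a backup script could check how full each snapshot is before and during the rsync run. The following is only a rough sketch in Python, not what I actually run. It assumes that `lvs --noheadings --separator '|' -o lv_name,origin,snap_percent` prints one line per LV with an empty percentage for non-snapshots; the function names, the 80% threshold and the sample volume names are made up for illustration.

```python
import subprocess

def parse_lvs(output, threshold=80.0):
    """Return (lv_name, origin, percent) for snapshots at or above threshold.

    Expects lines of the form 'name|origin|percent' (assumed lvs output).
    """
    nearly_full = []
    for line in output.splitlines():
        fields = [f.strip() for f in line.split("|")]
        if len(fields) != 3 or not fields[1] or not fields[2]:
            continue  # no origin or no percentage: not a snapshot
        name, origin, pct = fields
        pct = float(pct.replace(",", "."))  # tolerate a comma decimal separator
        if pct >= threshold:
            nearly_full.append((name, origin, pct))
    return nearly_full

def check_snapshots(threshold=80.0):
    """Run lvs (usually needs root) and report snapshots that are nearly full."""
    out = subprocess.run(
        ["lvs", "--noheadings", "--separator", "|",
         "-o", "lv_name,origin,snap_percent"],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_lvs(out, threshold)

# Made-up sample output, only to show the parsing step without calling lvs.
sample = "  root||\n  root_snap|root|93.41\n  home_snap|home|12.00\n"
print(parse_lvs(sample))
```

Calling check_snapshots() from the backup wrapper and aborting (or extending the snapshot) when it returns anything would at least leave a trace of why the device went away.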

Wolfram Schlich <wschlich@xxxxxxxxxx>
Gentoo Linux * http://dev.gentoo.org/~wschlich/
