
To: "Linda A. Walsh" <xfs@xxxxxxxxx>
Subject: Re: various disk probs...dump: structure needs cleaning (on home), on root: invalid arguments (still)
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Tue, 30 Mar 2010 12:35:28 +1100
Cc: xfs-oss <xfs@xxxxxxxxxxx>
In-reply-to: <4BB146F3.3090806@xxxxxxxxx>
References: <4BB146F3.3090806@xxxxxxxxx>
User-agent: Mutt/1.5.20 (2009-06-14)
On Mon, Mar 29, 2010 at 05:33:55PM -0700, Linda A. Walsh wrote:
> This morning I got email from my dump job telling me it had a problem on my 
> home
> partition.
....

Linda, in future can you report a single issue per email? It's
extremely confusing reporting 3 or 4 issues in one email and then
jumping backwards and forwards between them randomly without clear
reference to what set of error messages you are referring to.....

> ----------------
> 
> Also, I _believe_ some of my stability problems have increased since I started
> trying to use "lvm" to maintain "snapshots" for enabling "previous versions" 
> under
> windows-clients with samba..

Make sure you mount snapshot images with "-o ro,norecovery" - you do
not want the snapshot being written to.
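
For example, something like this (the volume group, snapshot and
mount point names below are just placeholders):

$ sudo mount -t xfs -o ro,norecovery /dev/vg0/home_snap /mnt/home_snap

norecovery skips log replay at mount time, which matters because
replaying the log would otherwise write to the snapshot.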

> Seems like with only a few snapshots performance goes way down as each 
> snapshot
> needs to have duplicate data.

No, the snapshots do not have duplicate data - LVM snapshots use
copy-on-write so that each snapshot is an overlay of "original" data
from the block device. To read from a given snapshot, all the
overlays from the current image of the block device down to the
snapshot you are reading from need to be processed to determine where
the correct version of the data is. As you make more snapshots, more
overlays are added to the snapshot stack. Hence as you read from a
snapshot, the snapshot code has to process more overlays to determine
where to get the data from, and so the more snapshots you have, the
slower they go....
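
You can see the snapshot stack and how much COW space each snapshot
has consumed with plain lvs (the volume group name here is just an
example):

$ sudo lvs vg0

Every snapshot listed against the same Origin is another overlay in
that stack.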

> Same for this error that I still get (it ISN'T related to busy or locked 
> files --
> too many on non-busy disks). -- only happens with xfs_fsr (and never used to 
> before
> about 6 months ago)....
> 
> Note -- only happens on root which is relatively static:
> Mar 28 02:17:05 Ishtar fsr[19203]: / start inode=0
> Mar 28 02:17:05 Ishtar fsr[19203]: XFS_IOC_SWAPEXT failed: ino=5982243: 
> Invalid argument
> Mar 28 02:17:05 Ishtar fsr[19203]: XFS_IOC_SWAPEXT failed: ino=7584775: 
> Invalid argument
> Mar 28 02:17:05 Ishtar fsr[19203]: XFS_IOC_SWAPEXT failed: ino=16784218: 
> Invalid argument
> Mar 28 02:17:05 Ishtar fsr[19203]: XFS_IOC_SWAPEXT failed: ino=16784980: 
> Invalid argument
> Mar 28 02:17:05 Ishtar fsr[19203]: XFS_IOC_SWAPEXT failed: ino=16789184: 
> Invalid argument
> Mar 28 02:17:05 Ishtar fsr[19203]: XFS_IOC_SWAPEXT failed: ino=16792500: 
> Invalid argument
> Mar 28 02:17:05 Ishtar fsr[19203]: XFS_IOC_SWAPEXT failed: ino=16792601: 
> Invalid argument
> Mar 28 02:17:05 Ishtar fsr[19203]: XFS_IOC_SWAPEXT failed: ino=17181156: 
> Invalid argument
> Mar 28 02:17:05 Ishtar fsr[19203]: XFS_IOC_SWAPEXT failed: ino=24019080: 
> Invalid argument
> Mar 28 02:17:05 Ishtar fsr[19203]: XFS_IOC_SWAPEXT failed: ino=25978039: 
> Invalid argument
> Mar 28 02:17:05 Ishtar fsr[19203]: XFS_IOC_SWAPEXT failed: ino=50337960: 
> Invalid argument
> Mar 28 02:17:05 Ishtar fsr[19203]: XFS_IOC_SWAPEXT failed: ino=50349588: 
> Invalid argument

Those are files the kernel is not allowing the extents to be swapped
on. This happens all the time - most likely the files were modified
while the defragmentation was being attempted.
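
If you want to see what those inodes actually are, you can map an
inode number back to a path with something like:

$ sudo find / -xdev -inum 5982243

(-xdev keeps find on the root filesystem so the inode number isn't
matched on some other mounted filesystem.)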

> ---
> Unfortunately, I haven't been able to figure out how to run xfs_db on a live
> partition -- keeps telling me there is a log to be replayed and I should mount
> the partition...(BUT IT IS mounted!)...
> Sigh...

$ sudo xfs_db -r <device>
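
e.g. for a quick read-only fragmentation report on the mounted root
filesystem (the device name is just an example):

$ sudo xfs_db -r -c frag /dev/sda1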

> xfsdump: dumping directories
> xfsdump: dumping non-directory files
> xfsdump: WARNING: could not open regular file ino 2148055983 mode 0x000081f8:
> Structure needs cleaning: not dumped

What is in dmesg? The filesystem shut down because of a
corruption - the kernel log will tell us why.
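
Something like:

$ dmesg | grep -i xfs

(or the same messages in /var/log/messages around the time the dump
failed) should show the corruption report and the shutdown reason.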

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
