
Re: XFS-filesystem corrupted by defragmentation

To: Bernhard Gschaider <bgschaid_lists@xxxxxxxxx>
Subject: Re: XFS-filesystem corrupted by defragmentation
From: Robert Brockway <robert@xxxxxxxxxxxxxxxxx>
Date: Tue, 13 Apr 2010 10:58:22 -0400 (EDT)
Cc: xfs@xxxxxxxxxxx
In-reply-to: <87r5mjpn8l.fsf@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx>
References: <87r5mjpn8l.fsf@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx>
User-agent: Alpine 1.10 (DEB 962 2008-03-14)
On Tue, 13 Apr 2010, Bernhard Gschaider wrote:

> xfs_db -r /dev/mapper/VolGroup00-LogVol04
> xfs_db: unexpected XFS SB magic number 0x00000000
> xfs_db: read failed: Invalid argument
> xfs_db: data size check failed
> cache_node_purge: refcount was 1, not zero (node=0x2a25c20)
> xfs_db: cannot read root inode (22)

Hi Bernhard. Hmm, that doesn't sound good.

> The file-system is still mounted and working and I don't dare to do
> anything about it (am in a mild state of panic) because I think it
> might not come back if I do.

I think your choice to sit back and evaluate your options before acting is a wise one, especially since the filesystem is apparently mounted and functioning.

Depending on how worried you are, there are various options available. For example, you could declare an emergency on the server and use xfs_freeze to freeze the filesystem while you take a backup. Note that I have never used xfs_freeze like this; it is just a suggestion. Naturally this will cause an outage and problems for users.
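
To make the freeze idea concrete, here is a rough and untested sketch. The function names, the /data mount point in the usage note, and the backup command are all placeholders of mine, not something taken from your setup. By default it only prints what it would run. Keep in mind that writes block while the filesystem is frozen, so the backup step must not write to the frozen filesystem itself.

```shell
#!/bin/sh
# Sketch only: freeze an XFS filesystem, run one backup step, then thaw it.
# The mount point and the backup command are placeholders (assumptions).
# DRY_RUN=1 (the default) prints each command instead of executing it.

DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# usage: freeze_and_backup MOUNTPOINT BACKUP_COMMAND [ARGS...]
freeze_and_backup() {
    mountpoint="$1"
    shift
    # xfs_freeze -f suspends write access to the filesystem.
    run xfs_freeze -f "$mountpoint" || return 1
    # Run the backup step and remember its exit status.
    run "$@"
    status=$?
    # xfs_freeze -u thaws the filesystem; attempted even if the backup failed.
    run xfs_freeze -u "$mountpoint"
    return "$status"
}
```

Something like `freeze_and_backup /data my_backup_cmd` would print the three steps in order. Setting DRY_RUN=0 would actually execute them, which I would only do after reading the xfs_freeze man page carefully.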

Alternatively, you could use xfsdump to capture an incremental or full backup on the running system, depending on whether you already have a level 0 xfsdump file. The developers have confirmed (on this list) that xfsdump will provide a consistent backup on a live filesystem.
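
As a sketch of what the xfsdump invocation might look like: level 0 is a full dump and levels 1 to 9 are incrementals. The helper name, the session and media labels, and the paths in the usage note are made up for illustration. The helper only prints the command so you can review it before running anything, and the dump file should live on a different filesystem from the one being dumped.

```shell
#!/bin/sh
# Sketch: print an xfsdump command line for a given dump level.
# Level 0 is a full dump; levels 1-9 are incremental dumps.
# The -L (session label) and -M (media label) values are hypothetical;
# supplying them avoids xfsdump prompting for labels interactively.

# usage: xfsdump_cmd LEVEL MOUNTPOINT DUMPFILE
xfsdump_cmd() {
    level="$1"
    mountpoint="$2"
    dumpfile="$3"
    case "$level" in
        [0-9]) ;;
        *) echo "level must be a single digit 0-9" >&2; return 1 ;;
    esac
    echo "xfsdump -l $level -L level${level}-dump -M dumpfile -f $dumpfile $mountpoint"
}
```

For example, `xfsdump_cmd 0 /data /backup/data.l0` prints a full-dump command, and `xfsdump_cmd 1 /data /backup/data.l1` prints an incremental one.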

Please note that any heavy I/O (like a backup) has the potential to cause problems on a sick filesystem. In my experience XFS is inclined to automatically remount read-only if it detects problems. While this can be catastrophic for running processes, it is helpful in protecting data, so I'm happy it works this way.
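
If you want to check during the backup whether the filesystem currently shows up as read-only, something along these lines should do on Linux. It is a minimal sketch that assumes the usual /proc/mounts layout (device, mount point, type, options); the function name and the /data mount point in the usage note are my own placeholders.

```shell
#!/bin/sh
# Sketch: succeed (exit status 0) if MOUNTPOINT is listed with the "ro" option.
# Reads /proc/mounts by default; a different file can be passed for testing.

# usage: is_readonly MOUNTPOINT [MOUNTS_FILE]
is_readonly() {
    mountpoint="$1"
    mounts_file="${2:-/proc/mounts}"
    awk -v mp="$mountpoint" '
        $2 == mp { opts = $4 }      # remember options of the last matching entry
        END {
            n = split(opts, o, ",")
            for (i = 1; i <= n; i++)
                if (o[i] == "ro")
                    exit 0          # found an exact "ro" option
            exit 1                  # not mounted, or not read-only
        }' "$mounts_file"
}
```

Usage would be along the lines of `is_readonly /data && echo "/data is read-only"`.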

One last note: I hope you have good backups already. If you don't, then this is the time to start taking good backups.

These are the notes from my backup talk:


> I swear to god: I did not do anything else with the xfs_*-commands
> than the stuff mentioned above

I defrag XFS filesystems from cron as recommended by SGI and I've never had a problem. Maybe defragmentation didn't cause the problem - maybe it just revealed an underlying problem.



Email: robert@xxxxxxxxxxxxxxxxx
IRC: Solver
Web: http://www.practicalsysadmin.com
Open Source: The revolution that silently changed the world
