Re: server crashing

To: Artur Makówka <juice@xxxxxxxxxxxxx>
Subject: Re: server crashing
From: David Chinner <dgc@xxxxxxx>
Date: Wed, 12 Apr 2006 12:04:27 +1000
Cc: linux-xfs@xxxxxxxxxxx
In-reply-to: <443B60E8.6070004@xxxxxxxxxxxxx>
References: <443627B1.5090100@xxxxxxxxxxxxx> <20060410015916.GK2732@xxxxxxxxxxxxxxxxx> <443B60E8.6070004@xxxxxxxxxxxxx>
Sender: linux-xfs-bounce@xxxxxxxxxxx
User-agent: Mutt/1.4.2.1i
On Tue, Apr 11, 2006 at 09:55:20AM +0200, Artur Makówka wrote:
> >If there are no I/O errors being reported before the filesystem shuts down,
> >can you provide more information of the type of I/O the system is executing
> >when the shutdown occurs?
> 
> I see a lot of output similar to what I already posted, but that happened just
> AFTER the first successful mount. The output I'm pasting right now is (I
> think) from just BEFORE the crash. Also, the server is not doing anything
> in particular during that time. During the last 2 crashes it
> was refreshing awstats for every account in the system, i.e. running
> awstats.pl on the list of accounts. But it also 'crashed' many times
> during the day, when awstats was not running. From the 'after' logs
> I don't see why this shows: "Apr 11 09:47:53 alpha324 kernel: XFS
> internal error XFS_WANT_CORRUPTED_RETURN at line 298 of file
> fs/xfs/xfs_alloc.c.  Caller 0xc01f5091"
> 
> what does it mean,

From the line numbers in the error report, it looks like we don't have a
single contiguous extent large enough in the AG we are allocating
from, so we search the by-size btree to find the closest fit
possible.  We searched the last leaf node of the by-size btree,
found an extent and allocated it. We've then called
xfs_alloc_fixup_trees() to update the by-block btree, but we've
failed to find a match for the extent we just allocated from the
by-size btree in the by-block btree.

IOWs, the error indicates the AGF free space btrees are inconsistent or
one of them is corrupted.
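
The failure described above can be illustrated with a toy model. This is
only a sketch, not the kernel code or the on-disk format: the class
ToyFreeSpace and its methods alloc() and _fixup() are hypothetical names,
and the two btrees are stood in for by sorted Python lists. It assumes the
same free extents are recorded once keyed by start block and once keyed
by length, and that an allocation picked from the size index must then be
found in the block index.

```python
from bisect import insort


class ToyFreeSpace:
    """Toy model of one AG's free space, indexed two ways (not the XFS format)."""

    def __init__(self, extents):
        # extents: iterable of (start_block, length) free extents
        self.by_block = sorted((s, l) for s, l in extents)  # keyed by start block
        self.by_size = sorted((l, s) for s, l in extents)   # keyed by length

    def alloc(self, want):
        """Allocate up to `want` blocks; returns (start, blocks_allocated)."""
        if not self.by_size:
            raise RuntimeError("no free space")
        # Prefer the smallest extent that satisfies the request; otherwise
        # take the closest fit, the largest entry at the end of the size index.
        fit = next(((l, s) for l, s in self.by_size if l >= want),
                   self.by_size[-1])
        length, start = fit
        used = min(want, length)
        self._fixup(start, length, used)
        return start, used

    def _fixup(self, start, length, used):
        # The extent chosen from the size index must also exist in the
        # block index; if it does not, the two indexes disagree.
        if (start, length) not in self.by_block:
            raise RuntimeError(
                "free space indexes inconsistent: (%d, %d)" % (start, length))
        self.by_block.remove((start, length))
        self.by_size.remove((length, start))
        if used < length:
            # Put the unused tail of the extent back into both indexes.
            insort(self.by_block, (start + used, length - used))
            insort(self.by_size, (length - used, start + used))
```

In this model, removing an entry from by_block but not from by_size and
then calling alloc() raises the "inconsistent" error, which is the toy
analogue of the lookup failing in the by-block btree.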

> and why didn't xfs_repair repair it?

xfs_repair doesn't check the free space btrees, it simply rebuilds
them from scratch. Hence it won't warn about a corrupted AGF btree
during repair. However, after a repair they should be consistent.
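
To show why rebuilding from scratch gives consistent indexes without ever
inspecting the old ones, here is a toy sketch. It is not how xfs_repair is
implemented; the function rebuild_free_indexes() and its parameters are
hypothetical, and it assumes, purely for illustration, that free space is
derived as the complement of the in-use extents within an AG of
ag_blocks blocks.

```python
def rebuild_free_indexes(ag_blocks, used_extents):
    """Derive both free space indexes from the in-use extents alone.

    used_extents: iterable of (start_block, length) extents in use.
    Returns (by_block, by_size); the old indexes are never consulted.
    """
    free = []
    cursor = 0
    for start, length in sorted(used_extents):
        if start > cursor:
            free.append((cursor, start - cursor))  # gap before this used extent
        cursor = max(cursor, start + length)
    if cursor < ag_blocks:
        free.append((cursor, ag_blocks - cursor))  # trailing free space
    by_block = sorted(free)
    by_size = sorted((length, start) for start, length in free)
    return by_block, by_size
```

Because both lists come from the same freshly computed set of free
extents, they agree by construction, whatever state the previous indexes
were in. That is also why a rebuild of this kind has nothing to report
about the old corruption.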

OTOH, xfs_check will actually check the AGF btrees for corruption
and consistency. Can you run xfs_check on the filesystem after one of
these errors both before and after you run xfs_repair, and post
the output?
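
One possible way to capture that output is sketched below. The helper
names build_plan() and run_plan() are hypothetical, the device path is a
placeholder you must supply, and the sketch assumes the filesystem is
unmounted and that xfs_check and xfs_repair are on the PATH. Both tools
are invoked with the device as their only argument.

```python
import subprocess


def build_plan(device):
    """Return (label, argv) steps: check, repair, then check again."""
    return [
        ("check-before-repair", ["xfs_check", device]),
        ("repair", ["xfs_repair", device]),
        ("check-after-repair", ["xfs_check", device]),
    ]


def run_plan(plan):
    """Run each step in order; return {label: combined stdout/stderr text}."""
    results = {}
    for label, argv in plan:
        proc = subprocess.run(argv, stdout=subprocess.PIPE,
                              stderr=subprocess.STDOUT,
                              universal_newlines=True)
        results[label] = proc.stdout
    return results
```

Calling run_plan(build_plan(device)) with the real device name returns
the three outputs keyed by step, ready to paste into a reply.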

Cheers,

Dave.
-- 
Dave Chinner
R&D Software Engineer
SGI Australian Software Group
