Re: Corruption of in-memory data detected - on heavy hard linking

To: Christian Affolter <c.affolter@xxxxxxxxxxxxxxxxx>
Subject: Re: Corruption of in-memory data detected - on heavy hard linking
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Tue, 5 Aug 2008 10:19:52 +1000
Cc: xfs@xxxxxxxxxxx
In-reply-to: <489732B2.7000201@xxxxxxxxxxxxxxxxx>
Mail-followup-to: Christian Affolter <c.affolter@xxxxxxxxxxxxxxxxx>, xfs@xxxxxxxxxxx
References: <48876D03.8010804@xxxxxxxxxxxxxxxxx> <20080725052051.GA26367@xxxxxxxxxxxxx> <489732B2.7000201@xxxxxxxxxxxxxxxxx>
Sender: xfs-bounce@xxxxxxxxxxx
User-agent: Mutt/1.5.18 (2008-05-17)

On Mon, Aug 04, 2008 at 06:47:46PM +0200, Christian Affolter wrote:
> Hi
>
>> On Wed, Jul 23, 2008 at 07:40:19PM +0200, Christian Affolter wrote:
>>> Kernel-Error:
>>> Filesystem "sdc1": XFS internal error xfs_trans_cancel at line 1163 
>>> of  file fs/xfs/xfs_trans.c.  Caller 0xffffffff803a4fcf
>>> Pid: 22816, comm: cp Not tainted 2.6.24-gentoo-r8 #1
>>
>> 2.6.24 is pretty old.  Did you try with a recent kernel?  We had some
>> fixes for in-core memory corruption although I don't remember one in
>> this area.
>
> I finally found the time to update the kernel to a recent 2.6.26 version.
>
> Unfortunately the problem still exists:
> Filesystem "dm-3": XFS internal error xfs_trans_cancel at line 1163 of  
> file fs/xfs/xfs_trans.c.  Caller 0xffffffff803a6672
> Pid: 12584, comm: cp Not tainted 2.6.26-gentoo #1

Ok, what we need is the following. First, try to reproduce the
problem on a small filesystem (say a few GB). Once you've reproduced
the problem, unmount and remount the filesystem to get the log
replayed, then take an xfs_metadump image of the filesystem. Put the
metadump image somewhere that can be downloaded (ftp/web site) and
let us know where it is.
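
As a rough sketch of that sequence (not a tested procedure): the device
path, mount point and image file name below are placeholders, the helper
names are made up for illustration, and the second unmount before
xfs_metadump is an assumption, since the tool is meant to be run on an
unmounted or read-only filesystem. It defaults to a dry run that only
prints the commands.

```python
import subprocess


def metadump_commands(device, mountpoint, image):
    """Return the command sequence, in order, as argv lists."""
    return [
        ["umount", mountpoint],           # unmount after the shutdown
        ["mount", device, mountpoint],    # remount so the log gets replayed
        ["umount", mountpoint],           # assumption: dump an unmounted fs
        ["xfs_metadump", device, image],  # write the metadata image to a file
    ]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Placeholder paths; substitute the small test filesystem.
    run(metadump_commands("/dev/sdX1", "/mnt/test", "/tmp/fs.metadump"))
```

Running it for real needs root and dry_run=False.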

If this is anything like the previous problem I found and fixed,
then it will be a corner-case bug that is only triggered by a
specific layout of free space and we need the filesystem image
to be able to work out exactly what corner case is broken....

> Before the shutdown happens the copy command receives a
> "No space left on device" error:
> cp: cannot create regular file `[file name snipped]': No space left on device
> cp: cannot create regular file `[file name snipped]': Input/output error
>
> Although the device has more than 50% free space as well as free inodes.

It will be a single allocation group (AG) that is out of space, not
the entire filesystem.
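
To illustrate with invented numbers (a toy model only; it says nothing
about how XFS actually selects an allocation group or accounts for
space), a filesystem can be more than half empty overall while one
allocation group has nothing left:

```python
# Hypothetical figures: four equally sized allocation groups (AGs) and
# the number of free blocks in each. None of this comes from the report.
AG_SIZE = 1000
ag_free = [900, 0, 850, 700]


def overall_free_fraction(free, ag_size):
    """Fraction of all blocks in the filesystem that are free."""
    return sum(free) / (len(free) * ag_size)


def exhausted_ags(free):
    """Indexes of the AGs with no free blocks at all."""
    return [agno for agno, blocks in enumerate(free) if blocks == 0]


print("overall free fraction:", overall_free_fraction(ag_free, AG_SIZE))
print("AGs with no free space:", exhausted_ags(ag_free))
```

In this model an allocation that has to be satisfied from AG 1 would
fail even though the filesystem-wide total looks healthy.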

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx