Re: Corruption of in-memory data detected - on heavy hard linking

To: xfs@xxxxxxxxxxx
Subject: Re: Corruption of in-memory data detected - on heavy hard linking
From: Christian Affolter <c.affolter@xxxxxxxxxxxxxxxxx>
Date: Mon, 11 Aug 2008 14:26:30 +0200
In-reply-to: <20080805001952.GI6119@disturbed>
References: <48876D03.8010804@xxxxxxxxxxxxxxxxx> <20080725052051.GA26367@xxxxxxxxxxxxx> <489732B2.7000201@xxxxxxxxxxxxxxxxx> <20080805001952.GI6119@disturbed>
Sender: xfs-bounce@xxxxxxxxxxx
User-agent: Thunderbird 2.0.0.16 (X11/20080805)
Hi Dave

On Wed, Jul 23, 2008 at 07:40:19PM +0200, Christian Affolter wrote:
Kernel-Error:
Filesystem "sdc1": XFS internal error xfs_trans_cancel at line 1163 of file fs/xfs/xfs_trans.c. Caller 0xffffffff803a4fcf
Pid: 22816, comm: cp Not tainted 2.6.24-gentoo-r8 #1
2.6.24 is pretty old.  Did you try with a recent kernel?  We had some
fixes for in-core memory corruption although I don't remember one in
this area.
I finally found the time to update the kernel to a recent 2.6.26 version.

Unfortunately the problem still exists:
Filesystem "dm-3": XFS internal error xfs_trans_cancel at line 1163 of file fs/xfs/xfs_trans.c. Caller 0xffffffff803a6672
Pid: 12584, comm: cp Not tainted 2.6.26-gentoo #1

Ok, what we need is the following. First, try to reproduce the
problem on a small filesystem (say a few GB). Once you've reproduced
the problem, unmount and remount the filesystem to get the log
replayed, then take a xfs_metadump image of the filesystem. Put the
metadump image somewhere that can be downloaded (ftp/web site) and
let us know where it is.
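
A minimal sketch of the steps above, for anyone following along. The device, mount point and dump path are placeholders (assumptions, not values from this thread), and the `run` helper is made up for illustration: by default it only prints each command instead of executing it.

```shell
#!/bin/sh
# Placeholders: adjust to the small test filesystem used for reproduction.
DEV=${DEV:-/dev/sdX1}
MNT=${MNT:-/mnt/xfstest}
DUMP=${DUMP:-/var/tmp/xfs_meta.dump}

# Hypothetical helper: print the command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# (Reproduce the shutdown on $MNT before this point.)
run umount "$MNT"
run mount "$DEV" "$MNT"      # mounting replays the log
run umount "$MNT"            # take the dump from an unmounted filesystem
run xfs_metadump -g "$DEV" "$DUMP"
```

Running it with `DRY_RUN=0` executes the commands for real and needs root.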
Please excuse the delay; it took some time to reproduce the issue with newly generated, non-sensitive data...

However, while looking at the metadump (with the help of the strings command), a lot of non-existent file names appear. Non-existent in the sense that they are not present on this device: they may exist on other devices, but they definitely were never on the dumped device (the device was filled from /dev/zero before creating the XFS filesystem).

Therefore I'm a bit scared to place the dump publicly on the internet. Might it be possible to put it somewhere with user/password protection and hand the credentials to you privately?

On the other hand maybe I misunderstood the intention/working of xfs_metadump...

The dump was taken as follows:
xfs_metadump -g /dev/sdc2 /var/tmp/xfs_sdc2_meta.dump
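
A rough sketch of how a dump can be checked for a specific clear-text name before it is shared. The function name `count_name` and the example paths are made up for illustration, and `tr` is used as a simple stand-in for the strings command so that the sketch depends only on standard tools.

```shell
#!/bin/sh
# count_name FILE NAME
# Split FILE into runs of printable characters (roughly what strings does)
# and print how many of those runs contain NAME as a fixed string.
count_name() {
    LC_ALL=C tr -c '[:print:]' '\n' < "$1" | grep -c -F -- "$2"
}

# Example (placeholder name):
# count_name /var/tmp/xfs_sdc2_meta.dump some-file-name.txt
```

As far as I understand the xfs_metadump man page, most file names are obfuscated by default unless -o is given, so a clear-text hit for a long, real file name would be unexpected and worth a closer look.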

If this is anything like the previous problem I found and fixed,
then it will be a corner-case bug that is only triggered by a
specific layout of free space and we need the filesystem image
to be able to work out exactly what corner case is broken....

Before the shutdown happens the copy command receives a
"No space left on device" error:
cp: cannot create regular file `[file name snipped]': No space left on device
cp: cannot create regular file `[file name snipped]': Input/output error

Although the device has more than 50% free space as well as free inodes.

It will be an AG that is out of space, not the entire filesystem.
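
To look at free space per allocation group rather than for the whole filesystem, something along these lines may help. This is only a sketch: the function name `print_freesp_cmds` and the device are placeholders, it prints the xfs_db commands rather than running them, and the AG count has to be taken from the real filesystem (for example from the agcount value reported by xfs_info).

```shell
#!/bin/sh
# print_freesp_cmds DEV AGCOUNT
# Print one read-only xfs_db free-space summary command per allocation group.
print_freesp_cmds() {
    dev=$1
    agcount=$2
    ag=0
    while [ "$ag" -lt "$agcount" ]; do
        echo "xfs_db -r -c 'freesp -s -a $ag' $dev"
        ag=$((ag + 1))
    done
}

# Example with a placeholder device and four allocation groups:
print_freesp_cmds /dev/sdX1 4
```

An AG whose summary shows little or no free space would be a candidate for the one running out, though that alone does not confirm it is the cause of the shutdown.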

Cheers,

Dave.

Many thanks for your help!
Chris

