Re: XFS filesystem shutting down on linux 2.6.28.9 (xfs_rename)

To: Gabriel Barazer <gabriel@xxxxxxxx>
Subject: Re: XFS filesystem shutting down on linux 2.6.28.9 (xfs_rename)
From: Eric Sandeen <sandeen@xxxxxxxxxxx>
Date: Wed, 22 Jul 2009 23:11:33 -0500
Cc: xfs@xxxxxxxxxxx
In-reply-to: <000c01ca0ae0$e85420a0$b8fc61e0$@fr>
References: <000c01ca0ae0$e85420a0$b8fc61e0$@fr>
User-agent: Thunderbird 2.0.0.22 (Macintosh/20090605)
Gabriel Barazer wrote:
> Hi,
> 
> I recently put an NFS file server into production, with mostly XFS volumes on 
> LVM. The server was quite low on traffic until this morning, and one of the 
> filesystems has crashed twice since then with the following backtrace:
> 
> Filesystem "dm-24": XFS internal error xfs_trans_cancel at line 1164 of file 
> fs/xfs/xfs_trans.c.  Caller 0xffffffff811b09a7
> Pid: 2053, comm: nfsd Not tainted 2.6.28.9-filer #1
> Call Trace:
>  [<ffffffff811b09a7>] xfs_rename+0x4a1/0x4f6
>  [<ffffffff811b1806>] xfs_trans_cancel+0x56/0xed
>  [<ffffffff811b09a7>] xfs_rename+0x4a1/0x4f6
...

> xfs_force_shutdown(dm-24,0x8) called from line 1165 of file 
> fs/xfs/xfs_trans.c.  Return address = 0xffffffff811b181f
> Filesystem "dm-24": Corruption of in-memory data detected.  Shutting down 
> filesystem: dm-24
> 
> The two crashes are related to the same function: xfs_rename.

Can you run objdump -d xfs.ko | grep "xfs_rename\|xfs_trans_cancel"? Then
maybe we can see which call to xfs_trans_cancel in xfs_rename this was.
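
If there are several call sites, matching them against the xfs_rename+0x4a1 return address from the trace can be scripted. The following is only a rough sketch: find_call_sites is a hypothetical helper, the sample addresses and instruction bytes are made up, and it assumes the disassembly prints each call target with its symbol name (on a relocatable module the target may instead appear on a separate relocation line, so the matching would need adjusting).

```python
import re

# "0000000000012506 <xfs_rename>:" style function header lines
HEADER = re.compile(r"^([0-9a-f]+) <([^>]+)>:\s*$")
# "   129a2:<TAB>e8 5f 0e 00 00<TAB>callq  13806 <xfs_trans_cancel>" style lines
INSN = re.compile(r"^\s*([0-9a-f]+):\t([0-9a-f ]+?)\s*\t(.*)$")


def find_call_sites(disasm, caller, callee):
    """Return (call_offset, return_offset) pairs for calls to callee in caller."""
    sites = []
    start = None  # start address of caller while we are inside it
    for line in disasm.splitlines():
        m = HEADER.match(line)
        if m:
            start = int(m.group(1), 16) if m.group(2) == caller else None
            continue
        if start is None:
            continue
        m = INSN.match(line)
        if not m:
            continue
        addr = int(m.group(1), 16)
        nbytes = len(m.group(2).split())  # instruction length in bytes
        text = m.group(3)
        if text.startswith("call") and f"<{callee}>" in text:
            # The return address is the instruction following the call.
            sites.append((addr - start, addr + nbytes - start))
    return sites


# Made-up disassembly, for illustration only.
SAMPLE = "\n".join([
    "0000000000012506 <xfs_rename>:",
    "   12506:\t55                   \tpush   %rbp",
    "   12800:\te8 01 10 00 00       \tcallq  13806 <xfs_trans_cancel>",
    "   129a2:\te8 5f 0e 00 00       \tcallq  13806 <xfs_trans_cancel>",
    "   129a7:\t89 d8                \tmov    %ebx,%eax",
    "0000000000013806 <xfs_trans_cancel>:",
    "   13806:\t55                   \tpush   %rbp",
])

if __name__ == "__main__":
    for call_off, ret_off in find_call_sites(SAMPLE, "xfs_rename", "xfs_trans_cancel"):
        marker = "  <-- matches trace" if ret_off == 0x4a1 else ""
        print(f"call at +{call_off:#x}, returns to +{ret_off:#x}{marker}")
```

The offset in the trace is the return address, so the call itself sits a few bytes earlier; the sketch reports both.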

The problem relates to canceling a dirty transaction on an error path.
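
To illustrate why that ends in a shutdown, here is a toy model, not the kernel code. The Mount and Transaction classes are hypothetical; the only values taken from the report are the 0x8 shutdown flag and the "Corruption of in-memory data detected" message. The idea is that once a transaction has modified in-memory metadata, canceling it cannot undo those changes, so the filesystem shuts down rather than continue in an inconsistent state.

```python
# Flag value seen in the xfs_force_shutdown(dm-24,0x8) message.
SHUTDOWN_CORRUPT_INCORE = 0x8


class Mount:
    """Hypothetical stand-in for a mounted filesystem."""

    def __init__(self):
        self.shutdown_flags = 0

    def force_shutdown(self, flags):
        self.shutdown_flags |= flags


class Transaction:
    """Hypothetical stand-in for a transaction on that mount."""

    def __init__(self, mount):
        self.mount = mount
        self.dirty = False

    def log_change(self):
        # In-memory metadata has now been modified under this transaction.
        self.dirty = True

    def cancel(self):
        # A clean transaction can simply be dropped. A dirty one cannot be
        # rolled back, so the model shuts the filesystem down instead.
        if self.dirty and not self.mount.shutdown_flags:
            self.mount.force_shutdown(SHUTDOWN_CORRUPT_INCORE)


def rename_with_error(fail_after_modification):
    """Model an operation that hits an error before or after dirtying."""
    mount = Mount()
    tp = Transaction(mount)
    if fail_after_modification:
        tp.log_change()
    tp.cancel()  # error path
    return mount.shutdown_flags


if __name__ == "__main__":
    print(hex(rename_with_error(False)))  # error before any modification
    print(hex(rename_with_error(True)))   # error after a modification
```

In this model an error before any modification is harmless, while the same error after a modification trips the shutdown, which is the pattern the backtrace suggests.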

-Eric

> I _really_ cannot upgrade to 2.6.29 or later because of the "reconnect_path: 
> npd != pd" bug and the possibly related radix-tree bug ( 
> http://bugzilla.kernel.org/show_bug.cgi?id=13375 ) affecting all kernel 
> versions after 2.6.28.
> 
> Unmounting and then remounting the filesystem allows access to the mountpoint 
> again without any error message or apparent file corruption.
> This filesystem is used by ~30 NFS clients and contains about 5M files 
> (100GB).
> 
> Before using the volume over NFS, there was only local activity (rsync 
> syncing) and we didn't get any error.
> 
> I expect to see this crash again in a few hours unless the volume is 
> really corrupted. Would a full filesystem copy to a newly created volume 
> have a chance of solving the problem?
> 
> Thanks,
> 
> Gabriel
> 
