To: xfs@xxxxxxxxxxx
Subject: XFS internal error xfs_trans_cancel at line 1150 of file fs/xfs/xfs_trans.c
From: "Christian Røsnes" <christian.rosnes@xxxxxxxxx>
Date: Wed, 13 Feb 2008 11:51:51 +0100
Sender: xfs-bounce@xxxxxxxxxxx

Over the past month I've been hit with two cases of "xfs_trans_cancel
at line 1150". The two errors occurred on different RAID sets. In both
cases the error happened during an rsync from a remote server to this
server, and the local partition which reported the error was 99% full
(as reported by df -k; see below for details).
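
For reference, this is roughly how I check how full the affected
partition is (device and mount point taken from the case 1 details
below; the xfs_db free-space summary is just something I assume is
useful to include, not output I have saved):

# block and inode usage on the partition that shut down
df -k /data
df -i /data
# read-only summary of free space extents; results are only
# approximate while the filesystem is mounted
xfs_db -r -c 'freesp -s' /dev/sdb1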

System: Dell 2850
Mem: 4GB RAM
OS: Debian 3 (32-bit)
Kernel: 2.6.17.7 (custom compiled)

I've been running this kernel since August 2006 without any of these
problems until a month ago.

I've not used any of the earlier kernels in the 2.6.17 series.

/usr/src/linux-2.6.17.7# grep 4K .config
# CONFIG_4KSTACKS is not set
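
For completeness, the rest of the XFS-related options in the same
.config can be pulled out the same way (output omitted here):

/usr/src/linux-2.6.17.7# grep XFS .config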


Are there any known XFS problems with this kernel version and nearly
full partitions?

I'm thinking about upgrading the kernel to a newer version to see if
that fixes this problem. Are there any known XFS problems with version
2.6.24.2?

Thanks

Christian

--

case logs:

case 1:
Filesystem "sdb1": XFS internal error xfs_trans_cancel at line 1150 of
file fs/xfs/xfs_trans.c.  Caller 0xc0208467
 <c0201cb8> xfs_trans_cancel+0x54/0xe1  <c0208467> xfs_create+0x527/0x563
 <c0208467> xfs_create+0x527/0x563  <c0211d5f> xfs_vn_mknod+0x1a9/0x3bd
 <c01df2ad> xfs_dir2_leafn_lookup_int+0x49/0x452  <c020dbfc> xfs_buf_free+0x7f/0x84
 <c01d874d> xfs_da_state_free+0x54/0x5a  <c01e0c57> xfs_dir2_node_lookup+0x95/0xa0
 <c01da14d> xfs_dir2_lookup+0xf5/0x125  <c0163fcc> mntput_no_expire+0x14/0x71
 <c02123f6> xfs_vn_permission+0x1b/0x21  <c0211f86> xfs_vn_create+0x13/0x17
 <c01592ff> vfs_create+0xc2/0xf8  <c01596a1> open_namei+0x16d/0x5b3
 <c014b794> do_filp_open+0x26/0x3c  <c014b8f3> get_unused_fd+0x5a/0xb0
 <c014ba1a> do_sys_open+0x40/0xb6  <c014baa3> sys_open+0x13/0x17
 <c0102697> syscall_call+0x7/0xb
xfs_force_shutdown(sdb1,0x8) called from line 1151 of file
fs/xfs/xfs_trans.c.  Return address = 0xc0214e7d
Filesystem "sdb1": Corruption of in-memory data detected.  Shutting
down filesystem: sdb1
Please umount the filesystem, and rectify the problem(s)

mount options:
/dev/sdb1 on /data type xfs (rw,noatime)

df -k
/dev/sdb1            286380096 283256112   3123984  99% /data

sdb1 is an internal RAID set. Case 1 occurred last night, and I'm now
about to run xfs_repair on that partition.
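
The plan is roughly the following (device and mount point from the
case 1 details above; the exact steps are just what I intend to try):

umount /data
xfs_repair /dev/sdb1
# if xfs_repair complains about a dirty log, mount and cleanly umount
# once to replay it, or as a last resort use xfs_repair -L (zeroes the log)
mount -o noatime /dev/sdb1 /data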

case 2:

Filesystem "sdd1": XFS internal error xfs_trans_cancel at line 1150 of
file fs/xfs/xfs_trans.c.  Caller 0xc0208467
 <c0201cb8> xfs_trans_cancel+0x54/0xe1  <c0208467> xfs_create+0x527/0x563
 <c0208467> xfs_create+0x527/0x563  <c0211d5f> xfs_vn_mknod+0x1a9/0x3bd
 <c03155be> qdisc_restart+0x13/0x152  <c01248bb> in_group_p+0x26/0x2d
 <c01efc86> xfs_iaccess+0xad/0x15b  <c0206f21> xfs_access+0x2b/0x33
 <c01da0fd> xfs_dir2_lookup+0xa5/0x125  <c0163fcc> mntput_no_expire+0x14/0x71
 <c02123f6> xfs_vn_permission+0x1b/0x21  <c0211f86> xfs_vn_create+0x13/0x17
 <c01592ff> vfs_create+0xc2/0xf8  <c01596a1> open_namei+0x16d/0x5b3
 <c014b794> do_filp_open+0x26/0x3c  <c014b8f3> get_unused_fd+0x5a/0xb0
 <c014ba1a> do_sys_open+0x40/0xb6  <c014baa3> sys_open+0x13/0x17
 <c0102697> syscall_call+0x7/0xb
xfs_force_shutdown(sdd1,0x8) called from line 1151 of file
fs/xfs/xfs_trans.c.  Return address = 0xc0214e7d
Filesystem "sdd1": Corruption of in-memory data detected.  Shutting
down filesystem: sdd1
Please umount the filesystem, and rectify the problem(s)

mount options:
/dev/sdd1 on /content/raid03 type xfs (rw,noatime,logbufs=8,nobarrier)

df -k:
/dev/sdd1            1951266816 1925560144  25706672  99% /content/raid03

sdd1 is an external RAID set. In case 2 I rebooted, ran xfs_repair
from xfsprogs 2.9.4, and then remounted the partition; after that it
was fine. The xfs_repair output follows:

xfs_repair /dev/sdd1
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
        - agno = 10
        - agno = 11
        - agno = 12
        - agno = 13
        - agno = 14
        - agno = 15
        - agno = 16
        - agno = 17
        - agno = 18
        - agno = 19
        - agno = 20
        - agno = 21
        - agno = 22
        - agno = 23
        - agno = 24
        - agno = 25
        - agno = 26
        - agno = 27
        - agno = 28
        - agno = 29
        - agno = 30
        - agno = 31
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
        - agno = 10
        - agno = 11
        - agno = 12
        - agno = 13
        - agno = 14
        - agno = 15
        - agno = 16
        - agno = 17
        - agno = 18
        - agno = 19
        - agno = 20
        - agno = 21
        - agno = 22
        - agno = 23
        - agno = 24
        - agno = 25
        - agno = 26
        - agno = 27
        - agno = 28
        - agno = 29
        - agno = 30
        - agno = 31
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
done

