Hi, Thanks for the data in the previous thread: http://oss.sgi.com/archives/xfs/2013-04/msg00327.html The data confirms Dave's theory that we are going off the end of the unlinked list when attempti
It's better to use trace-cmd for this; it will result in fewer dropped events, e.g.: $ trace-cmd record -e xfs_iunlink\* ... reproduce ... ^C $ trace-cmd report > trace.output I would suggest that the
... ... Good points, thanks Dave. A v2 that pulls up the tracepoints towards function entry is appended. Brian -- fs/xfs/linux-2.6/xfs_trace.h | 2 ++ fs/xfs/xfs_inode.c | 4 ++++ 2 files changed, 6 in
... ... Good points, thanks Dave. A v2 that pulls up the tracepoints towards function entry is appended. Brian From: Brian Foster <bfoster@xxxxxxxxxx> Date: Mon, 15 Apr 2013 18:16:24 -0400 Subject: [
Hi Brian, I have a question about the shutdown trace. The ino in xfs_iunlink_remove is 0x113, so why did xfs_imap get ino=0xffffffff? -- xfs_imap -- module("xfs").function("xfs_imap@fs/xfs/x
Hi Brian, If this happens because NULLAGINO is passed to xfs_inotobp(), can I move the following two lines before xfs_inotobp? For example: while (next_agino != agino) { /* * If the last i
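To illustrate the question above, here is a minimal userspace sketch of a walk over a singly linked unlinked list that stops at the end-of-list marker instead of following it. It is not the kernel code: the array `next`, the function `find_prev`, and the `walk_result` enum are hypothetical stand-ins, and the real loop in xfs_iunlink_remove reads each inode through xfs_inotobp rather than indexing an array. Only the NULLAGINO value (0xffffffff) and the `while (next_agino != agino)` loop shape come from the thread.

```c
#include <stddef.h>
#include <stdint.h>

/* End-of-list marker; 0xffffffff matches the ino seen in the shutdown trace. */
#define NULLAGINO ((uint32_t)-1)

/* Hypothetical result codes for this sketch only. */
enum walk_result { WALK_FOUND, WALK_HEAD, WALK_NOT_FOUND };

/*
 * Hypothetical model of one unlinked bucket: next[i] is the agino that
 * follows agino i in the list, or NULLAGINO at the tail.
 * Finds the predecessor of `agino`, starting from the bucket head.
 */
enum walk_result find_prev(const uint32_t *next, size_t n, uint32_t head,
                           uint32_t agino, uint32_t *prev_out)
{
    uint32_t cur = head;
    uint32_t prev = NULLAGINO;
    size_t steps = 0;

    if (head == agino)
        return WALK_HEAD;       /* target is first in the bucket */

    while (cur != agino) {
        /*
         * Check for the end marker BEFORE dereferencing. Without this
         * check the walk would try to look up inode 0xffffffff, which is
         * the "going off the end of the list" case discussed here.
         * The step bound also guards against a corrupted, cyclic list.
         */
        if (cur == NULLAGINO || cur >= n || steps++ >= n)
            return WALK_NOT_FOUND;
        prev = cur;
        cur = next[cur];
    }

    *prev_out = prev;
    return WALK_FOUND;
}
```

Returning a distinct "not found" result only makes the condition visible. It does not explain why the inode is missing from the list, which is the underlying problem.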
On Apr 16, 2013, at 8:48 PM, <yongtaofu@xxxxxxxxx> wrote: Hi Brian, Can I make the following change? ASSERTs are no-ops in a non-debug kernel, so this won't change any behavior. I hope we'll know more
Hi Eric, The shutdown issue has not been reproduced yet, but I got the following error today during testing. Apr 18 07:42:51 10 kernel: Call Trace: Apr 18 07:42:51 10 kernel: [<ffffffffa02d91ef>] ? xfs_
Hi Brian and Eric, If the problem is that the agno can't be found in the unlinked list, can we just bypass it instead of passing ino=0xffffffff to xfs_inotobp? Thank you. 2013/4/18 <yongtaofu@xxxxxxxxx> H
Hi Brian and Eric, Can I make the following change to bypass it? -- a/xfs_inode.c +++ b/xfs_inode.c @@ -1764,7 +1764,7 @@ xfs_iunlink_remove( */ next_agino = be32_to_cpu(agi->agi_unlinked[bucket_index]); l
This is probably not a wise thing to do. The problem we're seeing here is indicative of a potentially larger problem than this particular error path. An inode is being unlinked and inactivated, but w
Hi Brian and Eric, Here's the meta_dump file and xfs_repair log from one server. And again, this happened exactly when a glusterfs rebalance finished. https://docs.google.com/file/d/0B7n2C4T5tfNCdDF
Thanks, we'll take a look. Just to double check, in the kernel that ran the tracepoints, did you use Brian's 2nd version of the patch? I want to make sure the tracepoints were at the top of the funct
Understood. We've been trying very hard to reproduce ourselves to make it easier to debug, but haven't been able to reproduce at all so far. This process allows us to make _some_ progress on the issu
Dear Brian and Eric, Kernel kernel-2.6.32-279.19.1.el6.x86_64.rpm still has this problem. I built the kernel from this SRPM: https://oss.oracle.com/ol6/SRPMS-updates/kernel-2.6.32-279.19.1.el6.src.rpm
Same issue, one file was unlinked twice in a race: == ino 0x6b133 == <...>-4477 [003] 2721.176790: xfs_iunlink: dev 8:16 ino 0x6b133 <...>-4477 [003] 2721.176839: xfs_iunlink_remove: dev 8:16 ino 0x6
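A per-inode consistency check over the xfs_iunlink / xfs_iunlink_remove events could be sketched as follows. This is an illustration under stated assumptions, not a tool used in the thread: the `ev` enum and `first_inconsistent` function are hypothetical, the events are assumed to be already extracted for a single inode and in timestamp order, and the inode is assumed to be off the unlinked list when tracing starts.

```c
#include <stddef.h>

/* Hypothetical event codes for the two tracepoints named in the trace. */
enum ev { EV_IUNLINK, EV_IUNLINK_REMOVE };

/*
 * Scan the events recorded for ONE inode, in timestamp order.
 * Returns the index of the first event that is inconsistent with the
 * inode's list state (an add while already on the list, or a remove
 * while not on it), or -1 if every add is matched by one remove.
 */
long first_inconsistent(const enum ev *evs, size_t n)
{
    int on_list = 0;    /* assumption: inode starts off the unlinked list */

    for (size_t i = 0; i < n; i++) {
        if (evs[i] == EV_IUNLINK) {
            if (on_list)
                return (long)i;     /* added twice without a remove */
            on_list = 1;
        } else {
            if (!on_list)
                return (long)i;     /* removed while not on the list */
            on_list = 0;
        }
    }
    return -1;
}
```

A non-negative result points at the event worth inspecting in the trace-cmd report for that inode.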