xfstests 011 trips an ASSERT
Chandra Seetharaman
sekharan at us.ibm.com
Fri Feb 25 20:00:44 CST 2011
Hi Dave,
Did the git bisect on the merge tree and end up up on this commit as the
offensive one.
----------------------------
3f16b9850743b702380f098ab5e0308cd6af1792 is the first bad commit
commit 3f16b9850743b702380f098ab5e0308cd6af1792
Author: Dave Chinner <dchinner at redhat.com>
Date: Tue Dec 21 12:29:01 2010 +1100
xfs: introduce new locks for the log grant ticket wait queues
The log grant ticket wait queues are currently protected by the log
grant lock. However, the queues are functionally independent from
each other, and operations on them only require serialisation
against other queue operations now that all of the other log
variables they use are atomic values.
Hence, we can make them independent of the grant lock by introducing
new locks just to protect the lists operations. because the lists
are independent, we can use a lock per list and ensure that reserve
and write head queuing do not contend.
To ensure forced shutdowns work correctly in conjunction with the
new fast paths, ensure that we check whether the log has been shut
down in the grant functions once we hold the relevant spin locks but
before we go to sleep. This is needed to co-ordinate correctly with
the wakeups that are issued on the ticket queues so we don't leave
any processes sleeping on the queues during a shutdown.
Signed-off-by: Dave Chinner <dchinner at redhat.com>
Reviewed-by: Christoph Hellwig <hch at lst.de>
------------------------------
Thanks
chandra
On Fri, 2011-02-25 at 13:06 +1100, Dave Chinner wrote:
> On Thu, Feb 24, 2011 at 01:54:40PM -0800, Chandra Seetharaman wrote:
> > Hello,
> >
> > I ran the xfs tests on my POWER box and was tripped by an ASSERT while
> > running test 011. I have not seen it before in 2.6.37, so started with a
> > git-bisect from 2.6.37 to 2.6.38-rc6 and ended at this commit:
> > --------------------
> > Author Linus Torvalds <torvalds at linux-foundation.org>
> > Tue, 11 Jan 2011 19:42:06 +0000 (11:42 -0800)
> > committer Linus Torvalds <torvalds at linux-foundation.org>
> > Tue, 11 Jan 2011 19:42:06 +0000 (11:42 -0800)
> > commit 7bc4a4ce68f8c6d064ea949446852e996526f692
> > Merge branch 'for-linus-merged' of git://oss.sgi.com/xfs/xfs
> >
> > * 'for-linus-merged' of git://oss.sgi.com/xfs/xfs: (47 commits)
> > xfs: convert grant head manipulations to lockless algorithm
> > xfs: introduce new locks for the log grant ticket wait queues
> > xfs: convert log grant heads to atomic variables
> > xfs: convert l_tail_lsn to an atomic variable.
> > xfs: convert l_last_sync_lsn to an atomic variable
> > xfs: make AIL tail pushing independent of the grant lock
> > xfs: use wait queues directly for the log wait queues
> > xfs: combine grant heads into a single 64 bit integer
> > xfs: rework log grant space calculations
> > xfs: fact out common grant head/log tail verification code
> > xfs: convert log grant ticket queues to list heads
> > xfs: use AIL bulk delete function to implement single delete
> > xfs: use AIL bulk update function to implement single updates
> > xfs: remove all the inodes on a buffer from the AIL in bulk
> > xfs: consume iodone callback items on buffers as they are processed
> > xfs: reduce the number of AIL push wakeups
> > xfs: bulk AIL insertion during transaction commit
> > xfs: clean up xfs_ail_delete()
> > xfs: Pull EFI/EFD handling out from under the AIL lock
> > xfs: fix EFI transaction cancellation.
> > ...
> > -----------------------------
> >
> > Since it included so many patches, I thought i will start a git-bisect
> > on xfs git tree at git://oss.sgi.com/xfs/xfs, and ended up at
> > 92f1c008ae79e32b83c0607d184b194f302bb3ee, which is the same commit as
> > above.
>
> you landed on the the merge commit. mark that commit as bad, and the
> bisect should continue down the merged branch. Otherwise you could
> try skipping the merge commit and again the bisect should continue
> down the branch.
>
> > Two questions:
> > 1. Has anybody seen this problem ? I see this ASSERT has been added in
> > that patch set, is it a false-positive ?
>
> The ASSERT() being triggered is racy. We are comparing the state of
> two different atomic variables that are updated without
> synchronisation. Hence after sampling the first, the second could be
> changed before it is sampled and hence cause the assert to fail.
>
> I've never seen it be triggered, so I'm interested to know if this
> is reproducable (i.e. a real problem) or whether it is a false
> trigger (i.e. update race).
>
> Cheers,
>
> Dave.
More information about the xfs
mailing list