
Re: [PATCH 44/49] xfs: Reduce allocations during CIL insertion

To: Mark Tinguely <tinguely@xxxxxxx>
Subject: Re: [PATCH 44/49] xfs: Reduce allocations during CIL insertion
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Fri, 26 Jul 2013 10:32:08 +1000
Cc: "Michael L. Semon" <mlsemon35@xxxxxxxxx>, xfs@xxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <51F13E10.1010805@xxxxxxx>
References: <1374215120-7271-1-git-send-email-david@xxxxxxxxxxxxx> <1374215120-7271-45-git-send-email-david@xxxxxxxxxxxxx> <51EEF26F.5040001@xxxxxxx> <51EEF949.9020104@xxxxxxxxx> <51EFD68A.40400@xxxxxxx> <20130725002108.GA11222@dastard> <51F13E10.1010805@xxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)
On Thu, Jul 25, 2013 at 10:02:40AM -0500, Mark Tinguely wrote:
> On 07/24/13 19:21, Dave Chinner wrote:
> >On Wed, Jul 24, 2013 at 08:28:42AM -0500, Mark Tinguely wrote:
> >>If you could please redo the test and get the stack traces with
> >>/proc/sysrq-trigger and, if your kernel works with crash, a core dump.
> >>For the stack trace, I mostly want to know if it has several
> >>"xlog_grant_head_wait" entries in it, because ...
> >>
> >>...I seem to have triggered a couple of log space reservation hangs
> >>with fsstress on one XFS partition and a mega-copy on another
> >>partition, but will have to graft the new XFS tree onto a Linux 3.10
> >>kernel to get crash (and one of my sata controllers) to work again.
> >
> >They are unrelated to this patchset.
> >
> >Somewhere in the code there
> >is a mismatch between what we reserve as the base requirement for an
> >actual log write and what the CIL actually steals, and that is, most
> >likely, what is leading to log hangs.
> >
> >This is demonstrable by the fact that generic/070 on 512 byte
> >block size filesystems regularly hits a transaction reservation
> >exhausted assert failure on transaction commit of the periodic log
> >dummy transaction on my test rigs.
> >
> >Cheers,
> >
> >Dave.
> In testing patch 44, I did not trip over any cil stealing asserts
> before the hang. I think the cil steal assert is a different and
> legitimate complaint. When I tripped over the ASSERT with the v3
> inode enabled, the writeid only reserves space for the sb but there
> were occasions of root btree and attribute fork entry that were also
> logged.
> Patch 43 runs for hours without incident. Previous to this series, I
> ran the same tests with parent pointer testing with much higher log
> reservations for a day or two and never got a hang.
> I tested patch 44 twice with copy-like tests and it hung both
> times - not a convincing number of tests. At a quick look, I see an
> empty AIL, empty CIL, the CTX is using 0 bytes, doesn't look like
> there are any cil pushes going nor any older ctx, the ctx has an
> empty ticket reservation. The log tail is 0xd000014d7 and
> reserve/grant is 0xe00204d04. The next reservation is for a rename
> transaction that uses just over the log space left. There has to be
> a log space leak. I will go back to patch 43 on one machine and patch
> 44 on another and make sure it is patch 44 that is causing the problem.

Right, a patch that makes transaction commits go faster is likely to
cause a pre-existing reservation leak to leak faster....
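
For anyone wanting to sanity-check the "just over the log space left"
observation from a dump, here is a minimal sketch of the free-space
arithmetic. It assumes the two values quoted above are the raw packed
forms: the tail as an LSN (cycle in the upper 32 bits, 512-byte basic
block number in the lower 32) and the grant head as cycle in the upper
32 bits with a byte offset in the lower 32. The function names are
hypothetical, and the logic is only my reading of what
xlog_space_left() does, not a copy of it.

```python
BBSHIFT = 9  # assumption: 512-byte basic blocks


def crack_lsn(lsn):
    """Split a packed LSN into (cycle, basic block number)."""
    return lsn >> 32, lsn & 0xFFFFFFFF


def crack_grant_head(head):
    """Split a packed grant head into (cycle, byte offset)."""
    return head >> 32, head & 0xFFFFFFFF


def log_space_left(tail_lsn, grant_head, log_size_bytes):
    """Bytes still available for reservation between grant head and tail."""
    tail_cycle, tail_block = crack_lsn(tail_lsn)
    head_cycle, head_bytes = crack_grant_head(grant_head)
    tail_bytes = tail_block << BBSHIFT

    if tail_cycle == head_cycle and head_bytes >= tail_bytes:
        # Head has not wrapped: free space is everything outside [tail, head).
        return log_size_bytes - (head_bytes - tail_bytes)
    if tail_cycle + 1 < head_cycle:
        # Head is more than one cycle ahead of the tail: nothing is free.
        return 0
    if tail_cycle < head_cycle:
        # Head has wrapped once: free space is the gap up to the tail.
        return tail_bytes - head_bytes
    raise ValueError("grant head is behind the log tail")


# The log size is not given in this thread; 10 MiB is a placeholder that
# is not used on the wrapped path these two values take.
free = log_space_left(0xD000014D7, 0xE00204D04, 10 * 1024 * 1024)
print(free)
```

Under those assumptions the head is one cycle ahead of the tail, so the
result is the tail offset (5335 blocks, 2731520 bytes) minus the head
offset (2116868 bytes), which comes to 614652 bytes. With an empty AIL
and CIL that space should not be held by anything, which is what you
would expect a reservation leak to look like.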


Dave Chinner
