Re: [ASSERT failure] transaction reservations changes bad?

To: Jeff Liu <jeff.liu@xxxxxxxxxx>
Subject: Re: [ASSERT failure] transaction reservations changes bad?
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Wed, 27 Mar 2013 13:03:31 +1100
Cc: xfs@xxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <51517506.1020906@xxxxxxxxxx>
References: <20130312062001.GJ21651@dastard> <20130312062531.GK21651@dastard> <513EE274.6090808@xxxxxxxxxx> <20130312103138.GN21651@dastard> <513F0C07.1060000@xxxxxxxxxx> <513F17F3.1010204@xxxxxxxxxx> <20130312120545.GO21651@dastard> <51517506.1020906@xxxxxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)
On Tue, Mar 26, 2013 at 06:14:30PM +0800, Jeff Liu wrote:
> On 03/12/2013 08:05 PM, Dave Chinner wrote:
> > On Tue, Mar 12, 2013 at 07:56:35PM +0800, Jeff Liu wrote:
> >> More info: 3.7.0 is the oldest kernel in my environment, and I ran into
> >> the same problem there.
> > 
> > Thanks for following up so quickly, Jeff. So the problem is that a
> > new test is tripping over a bug that has been around for a while,
> > not that it is a new regression.
> > 
> > OK, so I'll expunge that from my testing for the moment as I don't
> > have time to dig in and find out what the cause is right now. If
> > anyone else wants to.... :)
> 
> I did some further tests to nail down this issue. I'm posting the analysis
> here in case it is of some use when we revisit it.
> 
> The disk is formatted as per Dave's previous comments, i.e.
> mkfs.xfs -f -b size=512 -d agcount=16,su=256k,sw=12 -l su=256k,size=2560b /dev/xxx
> 
> First of all, it looks like this bug has stayed hidden for years: I can
> reproduce it on upstream kernels from 3.0 to 3.9.0-rc3, and the oldest
> kernel I have tried, 2.6.39, has the same problem.

If you mount 2.6.39 with "-o nodelaylog", does the problem go away?

> IMHO, it looks like the major cause is related to the 'sunit' parameter,
> since it affects the log space unit calculation via
> '2*log->l_mp->m_sb.sb_logsunit' in xlog_ticket_alloc().  However,
> we don't take this factor into consideration at the mkfs or mount
> stage. Should we take it into account?

That's what I suspected was the problem. i.e. that the log was too
small for the given configuration.

The question is this: how much space do we need to reserve? I'm
thinking a minimum of 4*lsu, i.e. 2*lsu for the existing CIL context
and another 2*lsu for any queued ticket waiting for space to become
available.
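
A rough sketch of that arithmetic for the mkfs parameters quoted above.
It assumes that the "b" suffix in size=2560b means filesystem blocks
(512 bytes here) and that lsu is the 256k log stripe unit. The function
names are hypothetical, and whether 4*lsu is the right minimum is the
open question, so this only shows how little headroom the test log has.

```python
# Values taken from the mkfs.xfs command line quoted in this thread.
BLOCK_SIZE = 512          # -b size=512
LOG_BLOCKS = 2560         # -l size=2560b (assumed: filesystem blocks)
LSU = 256 * 1024          # -l su=256k, log stripe unit in bytes


def log_size_bytes(blocks=LOG_BLOCKS, block_size=BLOCK_SIZE):
    """Size of the log in bytes for the quoted configuration."""
    return blocks * block_size


def proposed_reserve(lsu=LSU):
    """The 4*lsu minimum floated above: 2*lsu for the existing CIL
    context plus 2*lsu for a queued ticket waiting for space."""
    cil_context = 2 * lsu
    queued_ticket = 2 * lsu
    return cil_context + queued_ticket


def headroom(lsu=LSU):
    """Log space left after setting aside the proposed reserve."""
    return log_size_bytes() - proposed_reserve(lsu)


if __name__ == "__main__":
    print("log size (bytes):      ", log_size_bytes())
    print("proposed 4*lsu (bytes):", proposed_reserve())
    print("headroom (bytes):      ", headroom())
```

Under these assumptions the log is 1280k and the 4*lsu reserve is 1024k,
which leaves a single lsu (256k) for everything else.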

I haven't thought a lot about it, though, and I have a little demon
sitting on my shoulder nagging me about whether specific thresholds
need to play a part in this, e.g. no single transaction can be
larger than half the log; AIL push thresholds of 25% of log space;
background CIL commit threshold of 12.5% of the log...
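
To put numbers on those thresholds for the log in this test, here is a
small sketch. It treats each threshold as a plain fraction of the log
size in bytes, which is an assumption: the kernel's own calculations may
use different units or rounding. The function and key names are
hypothetical, and the log size again assumes size=2560b means 2560
blocks of 512 bytes.

```python
# Values taken from the mkfs.xfs command line quoted in this thread.
LOG_BYTES = 2560 * 512    # -l size=2560b with -b size=512 (assumed)
LSU = 256 * 1024          # -l su=256k, log stripe unit in bytes


def log_thresholds(log_bytes=LOG_BYTES):
    """Thresholds named above, as simple fractions of the log size."""
    return {
        "max_transaction": log_bytes // 2,        # half the log
        "ail_push": log_bytes // 4,               # 25% of log space
        "cil_background_commit": log_bytes // 8,  # 12.5% of the log
    }


def exceeds(amount, log_bytes=LOG_BYTES):
    """Names of the thresholds that a given byte count is larger than."""
    return sorted(
        name for name, limit in log_thresholds(log_bytes).items()
        if amount > limit
    )


if __name__ == "__main__":
    for name, limit in log_thresholds().items():
        print(f"{name}: {limit} bytes")
    # The 2*lsu added per ticket, compared against each threshold.
    print("2*lsu exceeds:", exceeds(2 * LSU))
```

With these figures, a 2*lsu allowance (512k) is already larger than both
the 25% and the 12.5% marks of a 1280k log, though still under half the
log.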

So it's not immediately clear to me how much bigger the log needs to
be...

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
