
Re: [PATCH] xfs: xfs_alloc_fix_minleft can underflow near ENOSPC

To: Dave Chinner <david@xxxxxxxxxxxxx>, Mark Tinguely <tinguely@xxxxxxx>
Subject: Re: [PATCH] xfs: xfs_alloc_fix_minleft can underflow near ENOSPC
From: "Michael L. Semon" <mlsemon35@xxxxxxxxx>
Date: Tue, 17 Feb 2015 10:36:54 -0500
Cc: xfs@xxxxxxxxxxx
In-reply-to: <20150216231716.GB4251@dastard>
References: <1423782857-11800-1-git-send-email-david@xxxxxxxxxxxxx> <54DE8B6D.8010401@xxxxxxx> <20150214232951.GW4251@dastard> <54E16667.1050200@xxxxxxxxx> <54E22A76.40106@xxxxxxx> <20150216231716.GB4251@dastard>
User-agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:31.0) Gecko/20100101 Thunderbird/31.4.0
On 02/16/15 18:17, Dave Chinner wrote:
> On Mon, Feb 16, 2015 at 11:35:50AM -0600, Mark Tinguely wrote:
>> Thanks Michael, you don't need to hold your test box for me. I do
>> have a way to recreate these ABBA AGF buffer allocation deadlocks
>> and understand the whys and hows very well. I don't have a community
>> way to make a xfstest for it but I think your test is getting close.
> 
> If you know what is causing them, then please explain how it occurs
> and how you think it needs to be fixed. Just telling us that you know
> something that we don't doesn't help us solve the problem. :(
> 
> In general, the use of the args->firstblock is supposed to avoid the
> ABBA locking order issues with multiple allocations in the one
> transaction by preventing AG selection loops from looping back into
> AGs with a lower index than the first allocation that was made.
> 
> So if you are seeing deadlocks, then it may be that we aren't
> following this constraint correctly in all locations....
> 
> Cheers,
> 
> Dave.
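
The ordering constraint Dave describes above can be sketched roughly as
follows. This is an illustration only, not the XFS code: pick_ag(),
ag_allowed() and NO_FIRST_AG are hypothetical names, the free-space array
stands in for the per-AG state, and the sketch simply skips lower-numbered
AGs instead of modelling any locking.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical marker: no allocation has been made in this transaction yet. */
#define NO_FIRST_AG (-1)

/*
 * Assumed rule: once a transaction has allocated from first_ag, later
 * allocations may only use AGs with an index >= first_ag, so AGF buffers
 * are always taken in ascending order and two transactions cannot end up
 * holding them in opposite (ABBA) order.
 */
bool ag_allowed(int first_ag, int candidate_ag)
{
    return first_ag == NO_FIRST_AG || candidate_ag >= first_ag;
}

/*
 * Walk the AGs starting at start_ag, wrapping around once, and return the
 * first permitted AG with at least `want` free blocks, or -1 if none fits.
 */
int pick_ag(const unsigned *free_blocks, int ag_count,
            int first_ag, int start_ag, unsigned want)
{
    assert(ag_count > 0);

    for (int i = 0; i < ag_count; i++) {
        int ag = (start_ag + i) % ag_count;

        if (!ag_allowed(first_ag, ag))
            continue;   /* would loop back below the first allocation */
        if (free_blocks[ag] >= want)
            return ag;
    }
    return -1;          /* caller sees this as ENOSPC for the transaction */
}
```

With free counts of { 100, 0, 0, 5 }, a request for 10 blocks that starts
at AG 2 and has no earlier allocation wraps around to AG 0. The same
request in a transaction whose first allocation was in AG 2 gets -1,
because AG 0 is now off limits.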

Will this be a classic deadlock that will cause problems when trying to
kill processes and unmount filesystems?  If so, then I was unable to use
generic/224 to trigger a deadlock.  If not, then I'll need a better way
of looking at the problem.

The longest generic/224 loop lasted only 3-1/2 hours, though.  The
fstests enospc group was given some consideration as well.

If this issue does not require a lot of files, I might see if fio can 
be helpful here.

Hints on whether to use a fast kernel or a miserably slow kernel would
be rather helpful.

My test setup is torn because most of the recent warning messages are
coming from the CONFIG_XFS_WARN kernels.  The i686 Pentium 4 box will be
left that way.  However, the Core 2 box was configured per
Documentation/SubmitChecklist from the kernel source, adding debug XFS
and locktorture.  The locktorture settings are in flux, exercising 
spinlocks at present.  There was a mild halt in I/O for generic/017, but 
that was XFS waiting on kmem-something waiting on a kmemleak function.  
kmemleak was removed, and I'll continue from there.

Thanks!

Michael
