Re: [PATCH 2/2] xfs: mark the xfs-alloc workqueue as high priority

To: Eric Sandeen <sandeen@xxxxxxxxxxx>
Subject: Re: [PATCH 2/2] xfs: mark the xfs-alloc workqueue as high priority
From: Tejun Heo <tj@xxxxxxxxxx>
Date: Sun, 11 Jan 2015 01:33:12 -0500
Cc: Eric Sandeen <sandeen@xxxxxxxxxx>, xfs-oss <xfs@xxxxxxxxxxx>
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <54B1BE0E.7020302@xxxxxxxxxxx>
References: <54B01927.2010506@xxxxxxxxxx> <54B019F4.8030009@xxxxxxxxxxx> <20150109182310.GA2785@xxxxxxxxxxxxxx> <54B03BCC.7040207@xxxxxxxxxxx> <20150110192852.GD25319@xxxxxxxxxxxxxx> <54B1BE0E.7020302@xxxxxxxxxxx>
Sender: Tejun Heo <htejun@xxxxxxxxx>
User-agent: Mutt/1.5.23 (2014-03-12)

On Sat, Jan 10, 2015 at 06:04:30PM -0600, Eric Sandeen wrote:
> > The only reasons that work item would stay there are
> > 
> > * The rescuer is already executing something else from that workqueue
> >   and that one is stuck.
> I'll have to look at that.  I hope I still have access to the core...

Yes, if this is happening, the rescuer worker, which has the same name
as the workqueue, would be stuck somewhere.

> > * The worker pool is still considered to be making forward progress -
> >   there's a worker which isn't blocked and can burn CPU cycles.
> AFAICT, the first thing in the pool is the xfs_end_io blocked waiting for
> the ilock.
> I assume it's only the first one that matters?

Whichever work item is currently executing on that pool on that CPU.
Checking the tasks which are runnable on that CPU should show it.
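
To make the two conditions above concrete, here is a toy model in
Python. It is only a sketch of the reasoning in this thread, not the
kernel's workqueue code: Worker, Pool and why_item_waits are
hypothetical names, and "blocked" stands in for a task that is
sleeping on a lock or I/O rather than runnable on the CPU.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Worker:
    name: str
    blocked: bool  # True: sleeping (e.g. on a lock); False: runnable


@dataclass
class Pool:
    workers: List[Worker] = field(default_factory=list)

    def making_forward_progress(self) -> bool:
        # Assumption of this toy model: one runnable worker is enough
        # for the pool to count as making forward progress.
        return any(not w.blocked for w in self.workers)


def why_item_waits(pool: Pool, rescuer_stuck_on: Optional[str]) -> List[str]:
    """Return which of the two reasons from the thread apply.

    An empty list means neither applies, so in this model the rescuer
    would be expected to pick the pending work item up.
    """
    reasons = []
    if rescuer_stuck_on is not None:
        reasons.append("rescuer is stuck executing " + rescuer_stuck_on)
    if pool.making_forward_progress():
        runnable = [w.name for w in pool.workers if not w.blocked]
        reasons.append("pool has runnable workers: " + ", ".join(runnable))
    return reasons


if __name__ == "__main__":
    # Every worker blocked and the rescuer free: no reason to wait.
    pool = Pool([Worker("xfs_end_io", blocked=True)])
    print(why_item_waits(pool, None))
```

In this model a pool whose only worker is blocked, with an idle
rescuer, yields no reason for the item to stay queued, which is why
the runnable tasks on that CPU are the thing to check.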

> > Again, if xfs is using workqueue correctly, that work item shouldn't
> > get stuck at all.  What other workqueues are doing is irrelevant.
> and yet here we are; one of us must be missing something.  It's quite
> possibly me :) but we definitely have this thing wedged, and moving
> the xfsalloc item to the front via high priority did solve it.  Not saying
> it's the right solution, just a data point.

It's certainly possible that workqueue is misbehaving, but I doubt it,
especially given that the xfs issue has been around for quite a while,
which rules out recent regressions in the rescuer logic, and that
there haven't been any other cases of the forward progress guarantee
failing.


