
Re: 3.14-rc2 XFS backtrace because irqs_disabled.

To: Oleg Nesterov <oleg@xxxxxxxxxx>
Subject: Re: 3.14-rc2 XFS backtrace because irqs_disabled.
From: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
Date: Sat, 15 Feb 2014 18:05:20 +0000
Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>, Dave Chinner <david@xxxxxxxxxxxxx>, Dave Jones <davej@xxxxxxxxxx>, Eric Sandeen <sandeen@xxxxxxxxxxx>, Linux Kernel <linux-kernel@xxxxxxxxxxxxxxx>, xfs@xxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <20140215174345.GA24799@xxxxxxxxxx>
References: <20140212211421.GP18016@xxxxxxxxxxxxxxxxxx> <CA+55aFyobyUNFo=3rpdbxTqgV7OQetCKbCfwEEbgxUcT-1+30w@xxxxxxxxxxxxxx> <20140213174020.GA14455@xxxxxxxxxx> <CA+55aFxwozCQ05axLB02R3huX8sj=20EoFfw0cSDDL8fBE_Y6Q@xxxxxxxxxxxxxx> <20140215052531.GX18016@xxxxxxxxxxxxxxxxxx> <20140215142700.GA15540@xxxxxxxxxx> <20140215152251.GY18016@xxxxxxxxxxxxxxxxxx> <20140215153631.GZ18016@xxxxxxxxxxxxxxxxxx> <20140215155838.GA18016@xxxxxxxxxxxxxxxxxx> <20140215174345.GA24799@xxxxxxxxxx>
Sender: Al Viro <viro@xxxxxxxxxxxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)
On Sat, Feb 15, 2014 at 06:43:45PM +0100, Oleg Nesterov wrote:
> > So basically we want a different condition for "can we just go ahead and
> > free that sucker", right?  Instead of "it's on the list, shan't free it"
> > it ought to be something like "it's on the list or it is referenced by
> > ksiginfo".  Locking will be interesting, though... ;-/
> I guess yes... send_sigqueue() checks list_empty() too, probably nobody else.

The trouble being, we might end up with
        Q picked by collect_signal and stuffed into ksiginfo
        Q resubmitted by timer code
        Q picked by *another* thread into its ksiginfo
        the first thread finally done with signal
and at that point we still have a reference in the second thread's ksiginfo.
Hell knows - my first reflex in that kind of situation is to replace
that flag with refcount, so that timer code would hold a reference from
timer_create(2) to timer_delete(2), send_sigqueue() would bump it and
dismiss_siginfo() - drop the sucker.  But that means either grabbing
siglock in dismiss_siginfo() or making the counter atomic; either way
it's a cacheline ping-pong.  Atomic counter is less painful in that respect -
it would be right next to the list, so we dirty that cacheline anyway...
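
A minimal user-space sketch of the refcount variant described above, using C11
atomics rather than kernel primitives. Everything named here (struct sq,
sq_timer_create, sq_get, sq_put, sq_freed) is hypothetical and only stands in
for the roles mentioned in the text: sq_timer_create for the reference held
from timer_create(2), sq_get for the bump in send_sigqueue(), and sq_put for
both timer_delete(2) and the proposed dismiss_siginfo(). It is not the kernel's
struct sigqueue and ignores siglock entirely.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct sigqueue; not the kernel's layout. */
struct sq {
    atomic_int refs;   /* in the real thing this would sit next to the list head */
    int signo;
};

static int sq_freed;   /* counts frees so the sketch can be checked */

/* timer_create(2) side: allocate with one reference owned by the timer. */
static struct sq *sq_timer_create(int signo)
{
    struct sq *q = malloc(sizeof(*q));
    if (!q)
        return NULL;
    atomic_init(&q->refs, 1);
    q->signo = signo;
    return q;
}

/* send_sigqueue() side: every (re)submission takes its own reference. */
static void sq_get(struct sq *q)
{
    atomic_fetch_add_explicit(&q->refs, 1, memory_order_relaxed);
}

/*
 * dismiss_siginfo() / timer_delete(2) side: drop one reference and free
 * on the last one.  Returns true if the object was freed.
 */
static bool sq_put(struct sq *q)
{
    int old = atomic_fetch_sub_explicit(&q->refs, 1, memory_order_acq_rel);
    assert(old > 0);   /* dropping a reference nobody holds is a bug */
    if (old != 1)
        return false;
    free(q);
    sq_freed++;
    return true;
}
```

With this, the sequence above (picked by one thread, resubmitted, picked by
another, first thread done) leaves the object alive until whoever holds the
last reference drops it, regardless of the order in which that happens.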

I'm still trying another approach (slightly bigger ksiginfo used to store
all variants with si_code >= 0), but it has messiness of its own; we'll
see how it goes...
