Re: [PATCH 1/6] Extend completions to provide XFS object flush requirements

To: Matthew Wilcox <matthew@xxxxxx>
Subject: Re: [PATCH 1/6] Extend completions to provide XFS object flush requirements
From: Christoph Hellwig <hch@xxxxxxxxxxxxx>
Date: Thu, 26 Jun 2008 08:49:11 -0400
Cc: xfs@xxxxxxxxxxx, linux-kernel@xxxxxxxxxxxxxxx
In-reply-to: <20080626124009.GY4392@xxxxxxxxxxxxxxxx>
References: <1214455277-6387-1-git-send-email-david@xxxxxxxxxxxxx> <1214455277-6387-2-git-send-email-david@xxxxxxxxxxxxx> <20080626112612.GW4392@xxxxxxxxxxxxxxxx> <20080626113209.GK11558@disturbed> <20080626114242.GX4392@xxxxxxxxxxxxxxxx> <20080626122112.GL11558@disturbed> <20080626124009.GY4392@xxxxxxxxxxxxxxxx>
Sender: xfs-bounce@xxxxxxxxxxx
User-agent: Mutt/1.5.18 (2008-05-17)
On Thu, Jun 26, 2008 at 06:40:09AM -0600, Matthew Wilcox wrote:
> On Thu, Jun 26, 2008 at 10:21:12PM +1000, Dave Chinner wrote:
> > On Thu, Jun 26, 2008 at 05:42:42AM -0600, Matthew Wilcox wrote:
> > > Then let's leave it as a semaphore.  You can get rid of the sema_t if
> > > you like, but I don't think that turning completions into semaphores is
> > > a good idea (because it's confusing).
> > 
> > So remind me what the point of the semaphore removal tree is again?
> To remove the semaphores which don't need to be semaphores any more.
> > As Christoph suggested, I can put this under another API that
> > is implemented using completions. If I have to do that in XFS,
> > so be it....
> You could, yes.  But you could just use completions directly ...
> > The main reason for this that we've just uncovered the fact that the
> > way XFS uses semaphores is completely unsafe [*] on x86/x86_64 for
> > kernels prior to the new generic semaphores.
> > 
> > [*] 2.6.20 panics in up() because of this race when I/O completion
> > (the up call) races with a simultaneous down() (iowaiter):
> > 
> >     T1              T2
> >     up()            down()
> >                     kmem_free()
> > 
> > When the down() call completes, the up() call can still be
> > referencing the semaphore, and hence if we free the structure after
> > the down call then the up() will reference freed memory.  This is
> > probably the cause of many unexplained log replay or unmount panics
> > that we've been hitting for years with buffers that have been freed while
> > apparently still in use....
> This is exactly the kind of thing completions were supposed to be used
> for.  T1 should be calling complete() and T2 should be calling
> wait_for_completion().
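
For reference, the complete()/wait_for_completion() pattern described in the
quoted text can be sketched in userspace as below. This is an analogy only,
not the kernel implementation and not Dave's patch: the completion is modelled
loosely with a pthread mutex and condition variable, and struct io_buf,
io_done_thread() and run_one_io() are hypothetical names made up for the
illustration. The point it tries to show is that the waiter cannot return
until the signalling side has dropped the internal lock, so freeing the
structure right after the wait does not race with T1 (this relies on the
POSIX guarantee that a mutex may be destroyed once it has been unlocked).

```c
#include <pthread.h>
#include <stdlib.h>

/* Userspace stand-in for a completion (assumption: mutex + condvar,
 * loosely modelled on the kernel API, not the kernel code itself). */
struct completion {
	pthread_mutex_t lock;
	pthread_cond_t  wait;
	unsigned int    done;
};

static void init_completion(struct completion *c)
{
	pthread_mutex_init(&c->lock, NULL);
	pthread_cond_init(&c->wait, NULL);
	c->done = 0;
}

/* T1 side: every access to *c happens with c->lock held. */
static void complete(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	c->done++;
	pthread_cond_signal(&c->wait);
	pthread_mutex_unlock(&c->lock);
}

/* T2 side: returns only after taking c->lock, i.e. after T1 dropped it. */
static void wait_for_completion(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	while (c->done == 0)
		pthread_cond_wait(&c->wait, &c->lock);
	c->done--;
	pthread_mutex_unlock(&c->lock);
}

/* Hypothetical buffer that embeds its own completion. */
struct io_buf {
	struct completion iodone;
	int               result;
};

/* Hypothetical I/O completion handler (the T1 role). */
static void *io_done_thread(void *arg)
{
	struct io_buf *bp = arg;

	bp->result = 42;        /* pretend the I/O finished */
	complete(&bp->iodone);  /* last access to *bp by T1 */
	return NULL;
}

/* The T2 role: wait for the I/O, then free the buffer immediately. */
static int run_one_io(void)
{
	struct io_buf *bp = malloc(sizeof(*bp));
	pthread_t t1;
	int result;

	if (!bp)
		return -1;
	init_completion(&bp->iodone);
	bp->result = 0;
	if (pthread_create(&t1, NULL, io_done_thread, bp) != 0) {
		free(bp);
		return -1;
	}

	wait_for_completion(&bp->iodone);
	result = bp->result;

	/* T1 no longer touches *bp, so tearing it down here is safe. */
	pthread_cond_destroy(&bp->iodone.wait);
	pthread_mutex_destroy(&bp->iodone.lock);
	free(bp);

	pthread_join(t1, NULL);
	return result;
}
```

With the pre-generic x86 semaphores described above, the equivalent of
complete() could still be touching the structure after the waiter had
returned, which is what makes the immediate free unsafe there.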

Please read Dave's introductory mail.  What XFS wants is completions
with a little bit extra, so he implemented the little bit extra.  This
little bit extra is pretty well described in the mail starting this
