On Wed, Apr 08, 2015 at 06:21:04PM +0200, Christoph Hellwig wrote:
> On Tue, Apr 07, 2015 at 05:07:47PM -0400, J. Bruce Fields wrote:
> > On Tue, Apr 07, 2015 at 05:35:44PM +0200, Christoph Hellwig wrote:
> > > We want to drop all I/O path locks when recalling layouts, and that
> > > includes i_mutex for the write path. Without this we get stuck
> > > processes when recalls take too long.
> >
> > Also if the writer is an nfsd thread then we'd rather just error out
> > than wait.

(To be clear: ACK to this patch as far as I'm concerned, I've got
another concern but we need this fix regardless.)

> We have no way to know we are called by nfsd here unfortunately.

I was imagining the possible deadlock here as mostly theoretical, but
now that I think of it, it doesn't sound unlikely at all:
- file is under heavy write load
- conflicting operation breaks layout
- nfsd threads all block in writes to that file
- no nfsd threads available to service layout return
- recall times out, client fenced.
Ugh.
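
To make the sequence above concrete, here is a toy model of it. This is
plain Python, not nfsd code: the function name, the request labels and the
tick-based timeout are all made up for illustration, and it assumes every
write to the file blocks until the layout comes back.

```python
from collections import deque


def simulate(num_threads, pending_writes, recall_timeout_ticks):
    """Toy model of a fixed-size nfsd thread pool during a layout recall.

    Hypothetical simplification: every WRITE blocks its thread until the
    layout is returned, and the client's LAYOUTRETURN is queued behind the
    writes that were already in flight.
    """
    queue = deque(["WRITE"] * pending_writes + ["LAYOUTRETURN"])
    blocked = 0  # threads stuck in a write, waiting on the layout break

    for _tick in range(recall_timeout_ticks):
        # Every idle thread picks up the next queued request.
        while queue and blocked < num_threads:
            request = queue.popleft()
            if request == "LAYOUTRETURN":
                # A free thread was available to service the return.
                return "layout returned, writers unblocked"
            blocked += 1  # this thread now sits in the write path

    # No thread ever became free, so the recall runs out of time.
    return "recall timed out, client fenced"


if __name__ == "__main__":
    print(simulate(num_threads=8, pending_writes=4, recall_timeout_ticks=10))
    print(simulate(num_threads=8, pending_writes=100, recall_timeout_ticks=10))
```

With fewer in-flight writes than threads the return gets serviced; once the
writes fill the pool, nothing is left to handle the layout return and the
recall can only time out.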
--b.