Re: xfs_file_splice_read: possible circular locking dependency detected

To: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
Subject: Re: xfs_file_splice_read: possible circular locking dependency detected
From: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Date: Fri, 9 Sep 2016 19:06:29 -0700
Cc: Dave Chinner <david@xxxxxxxxxxxxx>, CAI Qian <caiqian@xxxxxxxxxx>, linux-xfs <linux-xfs@xxxxxxxxxxxxxxx>, xfs@xxxxxxxxxxx
In-reply-to: <20160909221945.GQ2356@xxxxxxxxxxxxxxxxxx>
References: <1450936953.949798.1473348551588.JavaMail.zimbra@xxxxxxxxxx> <20160908175632.GH2356@xxxxxxxxxxxxxxxxxx> <CA+55aFzg+Q0DzFNBR9TeL13_yfrfFwHu9OrZe--Zpje0EeN4Cw@xxxxxxxxxxxxxx> <20160908213835.GY30056@dastard> <20160908235521.GL2356@xxxxxxxxxxxxxxxxxx> <20160909015324.GD30056@dastard> <CA+55aFzohsUXj_3BeFNr2t50Wm=G+7toRDEz=Tk7VJqP3n1hXQ@xxxxxxxxxxxxxx> <CA+55aFxrqCng2Qxasc9pyMrKUGFjo==fEaFT1vkH9Lncte3RgQ@xxxxxxxxxxxxxx> <20160909023452.GO2356@xxxxxxxxxxxxxxxxxx> <CA+55aFwHQMjO4-vtfB9-ytc=o+DRo-HXVGckvXLboUxgpwb7_g@xxxxxxxxxxxxxx> <20160909221945.GQ2356@xxxxxxxxxxxxxxxxxx>

On Fri, Sep 9, 2016 at 3:19 PM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
>
> Cooking it...  The thing I really hate about the use of pipe_buffer is that
> we want to keep the "use on-stack array for default pipe size" trick, and
> pipe_buffer is fatter than I'd like.  Instead of pointer + two numbers +
> something to indicate whether it's picked from page cache or something we'd
allocated we get pointer + int + int + pointer + int + long, which turns
> into 5 words on 64bit.  With 16-element array of those on stack frame, it's
not nice - more than half a kilobyte of stack space with ->read_iter() yet to
> be called...  bvec would be better (60% savings boils down to 384 bytes
> shaved off that thing), but we'd need to play games with encoding the "is
> it page cache or not" bit somewhere in it.
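
The arithmetic there is easy to check. A minimal user-space sketch, with
stub pointer types standing in for the kernel ones, reproduces the sizes
being compared on a typical 64-bit build: 40 vs 16 bytes per entry, so a
16-element array drops from 640 to 256 bytes - the 384-byte (60%) saving
mentioned above.

#include <stdio.h>

/* stand-ins mirroring the layouts quoted above:
 * pipe_buffer = pointer + int + int + pointer + int + long */
struct pipe_buffer_sketch {
	void *page;			/* struct page * in the kernel */
	unsigned int offset, len;
	const void *ops;		/* struct pipe_buf_operations * */
	unsigned int flags;
	unsigned long private;		/* padded to an 8-byte boundary */
};

/* bio_vec = pointer + int + int */
struct bio_vec_sketch {
	void *bv_page;
	unsigned int bv_len;
	unsigned int bv_offset;
};

int main(void)
{
	printf("pipe_buffer: %3zu bytes, x16 = %zu\n",
	       sizeof(struct pipe_buffer_sketch),
	       16 * sizeof(struct pipe_buffer_sketch));	/* 40, 640 */
	printf("bio_vec:     %3zu bytes, x16 = %zu\n",
	       sizeof(struct bio_vec_sketch),
	       16 * sizeof(struct bio_vec_sketch));	/* 16, 256 */
	return 0;
}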

No, please don't play games like that.

I think you'd be better off with just a really small on-stack case
(maybe 2-3 entries), and allocate anything bigger dynamically. Or you
could even see how bad it is if you force-limit it to max 4 entries or
so and just do partial writes.
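
That pattern, as a stand-alone sketch (the names here are made up;
in-kernel the fallback would be kmalloc_array()/kfree() on an array of
pipe_buffer entries rather than malloc()/free() on ints):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NR_ONSTACK 3	/* the "really small on-stack case" */

/* hypothetical helper: small fixed array on the stack for the common
 * case, heap allocation only when the request is genuinely big */
static int process_bufs(const int *src, size_t n)
{
	int onstack[NR_ONSTACK];
	int *bufs = onstack;
	size_t i;

	if (n > NR_ONSTACK) {
		bufs = malloc(n * sizeof(*bufs));
		if (!bufs)
			return -1;
	}

	memcpy(bufs, src, n * sizeof(*bufs));
	for (i = 0; i < n; i++)
		printf("buf[%zu] = %d\n", i, bufs[i]);

	if (bufs != onstack)	/* only free what we actually allocated */
		free(bufs);
	return 0;
}

int main(void)
{
	int data[6] = { 1, 2, 3, 4, 5, 6 };
	process_bufs(data, 2);	/* common case: stays on the stack */
	process_bufs(data, 6);	/* big case: falls back to the heap */
	return 0;
}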

From when I looked at things (admittedly a *long* time ago), the
buffer sizes for things like read/write system calls were *very*
skewed.

There's a lot of small stuff, then there's the stuff that actually
honors st.st_blksize (normally one page), and then there's the
big-buffer stuff.
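
(For that middle case: st_blksize comes straight out of fstat(), and
sizing the read buffer from it is the classic well-behaved-reader
pattern. A sketch, not tied to anything in this thread:)

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/stat.h>

int main(int argc, char **argv)
{
	struct stat st;
	ssize_t n;
	char *buf;
	int fd;

	if (argc < 2 || (fd = open(argv[1], O_RDONLY)) < 0)
		return 1;
	if (fstat(fd, &st) < 0)
		return 1;

	/* honor st_blksize, normally one page (4096) */
	buf = malloc(st.st_blksize);
	if (!buf)
		return 1;
	while ((n = read(fd, buf, st.st_blksize)) > 0)
		;	/* consume n bytes here */

	free(buf);
	close(fd);
	return 0;
}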

And the thing is, the big buffers are almost never worth it. It's
often better to have a tight loop over smaller data than bouncing lots
of data into buffers and then out of buffers.

So I suspect all the "let's do many pages in one go" stuff is actually
not worth it. Especially since the pipes will basically force a wait
event when the pipe buffers fill up anyway.

So feel free to try maxing out at just a small handful of
pipe_buffer entries. Returning partial IO from splice() is fine.
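
(Caller-side, partial IO just means the usual retry loop. A sketch
using the real splice(2) signature; the file name is picked
arbitrarily:)

#define _GNU_SOURCE
#include <fcntl.h>
#include <unistd.h>

/* move up to len bytes from a file into a pipe, tolerating short
 * transfers from splice() - which is what makes a small kernel-side
 * pipe_buffer array acceptable */
static ssize_t splice_all(int file_fd, int pipe_wr, size_t len)
{
	while (len > 0) {
		ssize_t n = splice(file_fd, NULL, pipe_wr, NULL, len, 0);
		if (n <= 0)
			return n;	/* 0 = EOF, <0 = error */
		len -= n;		/* partial transfer: just loop */
	}
	return 1;
}

int main(void)
{
	int pfd[2];
	int fd = open("/etc/hostname", O_RDONLY);	/* any regular file */

	if (fd < 0 || pipe(pfd) < 0)
		return 1;
	return splice_all(fd, pfd[1], 64) < 0;	/* 64 bytes fits the pipe */
}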

                   Linus
