On Sat, Jan 18, 2014 at 12:44:53AM -0800, David Miller wrote:
> From: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
> Date: Sat, 18 Jan 2014 08:27:30 +0000
>
> > BTW, would sockets benefit from having ->sendpages() that would take an
> > array of (page, offset, len) triples? It would be trivial to do and
> > some of the helpers that are falling out of writing that writev-based
> > default_file_splice_write() look like they could be reused for
> > calling that one... Dave?
>
> That's originally how the sendpage method was implemented, but back then
> Linus asked us to only pass one page at a time.
>
> I don't remember the details beyond that.

FWIW, I wonder if what we are doing with ->msg_iov is the right thing.
We modify the iovecs in the array in place as we drain it, and that's
inconvenient for at least some callers (see e.g. the complaints in
fs/ncpfs about the need to copy the array, etc.).
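
To make the complaint concrete, this is essentially what the draining
looks like today (from memory, details trimmed):

	/* note that it writes back into the caller's iovec array,
	 * which is why ncpfs & co. have to copy the array first if
	 * they want to retry or reuse it.
	 */
	int memcpy_fromiovec(unsigned char *kdata, struct iovec *iov, int len)
	{
		while (len > 0) {
			if (iov->iov_len) {
				int copy = min_t(unsigned int, len, iov->iov_len);

				if (copy_from_user(kdata, iov->iov_base, copy))
					return -EFAULT;
				len -= copy;
				kdata += copy;
				iov->iov_base += copy;	/* clobbers the entry */
				iov->iov_len -= copy;	/* ditto */
			}
			iov++;
		}
		return 0;
	}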

What if we embed an iov_iter into the kernel-side msghdr and replace
memcpy_{to,from}iovec* with variants taking iov_iter *?  If nothing else,
it'll be marginally more efficient (no more skipping the already-emptied
iovecs) and it seems to be more convenient for callers.  If we are lucky,
that might even eliminate the need for ->sendpage() - just set the
iov_iter up over a <page,offset,size> array instead of an iovec one and
let ->sendmsg() do the smart thing if it knows how.  I haven't done a
comparison of {tcp,udp}_send{page,msg}, though - there might be dragons...
Even if that turns out to be infeasible, it will at least drive the
kmap/kunmap done by sock_no_sendpage() down into memcpy_from_iter(),
turning them into kmap_atomic/kunmap_atomic.
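
Something along these lines, that is (layout and names invented for the
sake of illustration - ITER_PAGES and struct page_vec in particular
don't exist anywhere):

	struct page_vec {			/* hypothetical triple */
		struct page *pv_page;
		unsigned int pv_offset;
		unsigned int pv_len;
	};

	struct iov_iter {
		int type;			/* ITER_IOVEC or ITER_PAGES */
		size_t iov_offset;		/* offset into the first element */
		size_t count;			/* bytes remaining */
		union {
			const struct iovec *iov;
			const struct page_vec *pvec;
		};
		unsigned long nr_segs;
	};

Advancing that only touches iov_offset/count/nr_segs, so the caller's
array is never modified, and a ->sendmsg() that understands the
page-backed flavour wouldn't need a separate ->sendpage() at all.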

The obvious price is that the kernel-side msghdr diverges from the
userland one, so copy_msghdr_from_user() needs to deal with that, but I
really doubt you'll find a workload where the cost of copying it in two
chunks instead of one would be measurable.
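
I.e. something like this for the copy-in (struct layout and the
initializer's signature invented; error handling and the compat side
omitted):

	struct user_msghdr {		/* what userland keeps passing in */
		void __user *msg_name;
		int msg_namelen;
		struct iovec __user *msg_iov;
		__kernel_size_t msg_iovlen;
		void __user *msg_control;
		__kernel_size_t msg_controllen;
		unsigned int msg_flags;
	};

	static int copy_msghdr_from_user(struct msghdr *kmsg,
					 struct user_msghdr __user *umsg)
	{
		struct user_msghdr tmp;

		if (copy_from_user(&tmp, umsg, sizeof(tmp)))
			return -EFAULT;

		/* chunk one: everything except the iovec array */
		kmsg->msg_name = tmp.msg_name;
		kmsg->msg_namelen = tmp.msg_namelen;
		kmsg->msg_control = tmp.msg_control;
		kmsg->msg_controllen = tmp.msg_controllen;
		kmsg->msg_flags = tmp.msg_flags;

		/* chunk two: set the embedded iterator up over the
		 * (still userland) iovec array; hypothetical helper */
		iov_iter_init(&kmsg->msg_iter, tmp.msg_iov, tmp.msg_iovlen);
		return 0;
	}
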
What else am I missing?