
Re: [RFC] cleaning up struct sock

To: davem@xxxxxxxxxx (David S. Miller)
Subject: Re: [RFC] cleaning up struct sock
From: Steven Whitehouse <steve@xxxxxxxxxxxxxx>
Date: Tue, 11 Dec 2001 11:14:39 +0000 (GMT)
Cc: acme@xxxxxxxxxxxxxxxx, jschlst@xxxxxxxxx, ncorbic@xxxxxxxxxxx, eis@xxxxxxxxxxxxx, dag@xxxxxxxxxxx, torvalds@xxxxxxxxxxxxx, marcelo@xxxxxxxxxxxxxxxx, netdev@xxxxxxxxxxx
In-reply-to: <20011210.231826.55509210.davem@xxxxxxxxxx> from "David S. Miller" at Dec 10, 2001 11:18:26 PM
Organization: ChyGywn Limited
Reply-to: Steve Whitehouse <Steve@xxxxxxxxxxx>
Sender: owner-netdev@xxxxxxxxxxx
Hi,

> 
> These things aren't like inodes.  Inodes are cached and lookup
> read-multiple objects, whereas sockets are throw-away and recycled
> objects.  Inode allocation performance therefore isn't that critical,
> but socket allocation performance is.
> 
We do have to allocate an inode as well though (at least in the normal
case). With the new scheme (if I've understood the conclusions of the
inode allocation discussions correctly), that would give us a single
allocation of sizeof(struct inode) + sizeof(struct socket) from a slab
cache private to sockfs.
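
As a very rough sketch (the struct and cache names here are made up,
nothing like this exists yet), that combined allocation might look like:

#include <linux/fs.h>
#include <linux/net.h>
#include <linux/slab.h>
#include <linux/init.h>
#include <linux/errno.h>

/* one slab object holding both the inode and the socket */
struct sock_inode {
        struct inode  vfs_inode;
        struct socket socket;
};

static kmem_cache_t *sock_inode_cachep;

static int __init sockfs_cache_init(void)
{
        sock_inode_cachep = kmem_cache_create("sock_inode_cache",
                                              sizeof(struct sock_inode),
                                              0, SLAB_HWCACHE_ALIGN,
                                              NULL, NULL);
        return sock_inode_cachep ? 0 : -ENOMEM;
}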

[snip]
> 
> You know, actually, the protocols are the ones which call sk_alloc().
> So we could just extend sk_alloc() to take a kmem_cache_t argument.
> TCP could thus make a kmem_cache_t which is sizeof(struct sock)
> + sizeof(struct tcp_opt) and then set the TP_INFO pointer to "(sk +
> 1)".
> 
> Oh yes, another overhead is all the extra dereferencing.  To fight
> that we could make a macro that knows the above layout:
> 
> #define TCP_PINFO(SK) ((struct tcp_opt *)((SK) + 1))
> 
> So I guess we could do things your way without any of the potential
> performance problems.
> 
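
If I've read that right, the idea amounts to something like the sketch
below (the extra sk_alloc() argument and the "tcp_sock" cache name are
my own inventions for illustration, not anything agreed):

#include <linux/slab.h>
#include <linux/string.h>
#include <net/sock.h>

/* sk_alloc() grows a cache argument; each protocol passes a cache
 * sized to hold struct sock plus its private data in one object. */
struct sock *sk_alloc(int family, int priority, int zero_it,
                      kmem_cache_t *cachep)
{
        struct sock *sk = kmem_cache_alloc(cachep, priority);

        if (sk && zero_it)
                memset(sk, 0, sizeof(struct sock));
        /* remaining initialisation as in the current sk_alloc() */
        return sk;
}

static kmem_cache_t *tcp_sk_cachep;

void tcp_sk_cache_init(void)
{
        /* one object holds the sock and the tcp_opt, so the protocol
         * data sits directly after the struct sock and is found with
         * the TCP_PINFO() macro quoted above */
        tcp_sk_cachep = kmem_cache_create("tcp_sock",
                                          sizeof(struct sock) +
                                          sizeof(struct tcp_opt),
                                          0, SLAB_HWCACHE_ALIGN,
                                          NULL, NULL);
}
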
This sounds like a good plan, but let me just throw in some wild ideas
based on my earlier comments about the struct inode...

I wonder if we could allocate a combined object along the lines of the
following:

  a) struct inode
  b) struct socket
  c) struct sock (minus per protocol areas)
  d) struct my_protocol (whatever the socket actually is)
  e) anything else required

all in one go (see the sketch after the list below). It seems the
reasons for not doing that are:

 1. Parts (a) and (b) are likely to have a different lifetime from parts
    (c) to (e), but with the broken-out inode union I wonder whether
    that would actually give us larger overhead or not.

 2. Need a way to prevent inodes from disappearing when the last close
    from user space happens while socket shutdown is still in progress.
    Perhaps we can simply increment the reference count in the release
    routine? I need to look into that to be sure.

 3. Need some way of setting different structure sizes for different
    protocols, so this looks like either breaking sockfs into one fs
    per protocol or adding some method of choosing different
    allocators. I'm not too sure that either of those solutions is
    "good".
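
For concreteness, the combined object might be laid out something like
this (all names invented for illustration; as per point 3, each
protocol would need its own cache of the right total size):

#include <linux/fs.h>
#include <linux/net.h>
#include <net/sock.h>

/* everything for one socket in a single slab object */
struct combined_sock {
        struct inode  vfs_inode;   /* (a) */
        struct socket socket;      /* (b) */
        struct sock   sk;          /* (c), minus per-protocol parts */
        /* (d)/(e): protocol private area follows; its size is fixed
         * by whichever per-protocol cache the object came from */
};

/* as with TCP_PINFO() above, the private area is reached by address
 * arithmetic rather than an extra pointer dereference */
#define COMBINED_PINFO(CS) ((void *)((CS) + 1))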

I'm not sure that any of these is a big hurdle to overcome, but without
actually doing it and comparing results I have no real feel for whether
it would give us an advantage in both memory efficiency and speed. It
does seem that if we are going down this road, we should consider
taking it to its ultimate end: one allocation per socket creation.


Steve.

