From: Eric Dumazet <dada1@xxxxxxxxxxxxx>
Date: Fri, 08 Jul 2005 08:53:02 +0200
> About making sk_buff smaller, I use this patch to declare 'struct
> sec_path *sp' only ifdef CONFIG_XFRM, what do you think ? I also
> use a patch to declare nfcache, nfctinfo and nfct only if
> CONFIG_NETFILTER_CONNTRACK or CONFIG_NETFILTER_CONNTRACK_MODULE are
> defined, but that's more intrusive. Also, tc_index is not used if
> only CONFIG_NET_SCHED is declared but none of CONFIG_NET_SCH_*. In my
> case, I am using CONFIG_NET_SCHED only to be able to do : tc -s -d
> qdisc
Distributions enable all of the ifdefs, and that is thus the
size and resultant performance most users see.
That's why I'm working on shrinking the size assuming all the
config options are enabled, because that is the reality for most
installations.
For all of this stuff we could consider stealing some ideas from BSD,
namely doing something similar to their MBUF tags.
If a subsystem wants to add a cookie to a networking buffer, it
allocates a tag and links it into the struct. So, you basically get
away with only one pointer (a struct hlist_head). We could use this
for the security, netfilter, and TC stuff. I don't know exactly what
our tags would look like, but perhaps:
	struct skb_tag;

	struct skb_tag_type {
		void		(*destructor)(struct skb_tag *);
		kmem_cache_t	*slab_cache;
		const char	*name;
	};

	struct skb_tag {
		struct hlist_node	list;
		struct skb_tag_type	*owner;
		int			tag_id;
		char			data[0];
	};

	struct sk_buff {
		...
		struct hlist_head	tag_list;
		...
	};
Then netfilter does stuff like:
	struct sk_buff *skb;
	struct skb_tag *tag;
	struct conntrack_skb_info *info;

	/* No conntrack tag attached means there is nothing to look at. */
	tag = skb_find_tag(skb, SKB_TAG_NETFILTER_CONNTRACK);
	if (tag)
		info = (struct conntrack_skb_info *) tag->data;
etc. etc.
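To make the lookup above concrete, here is a minimal userspace sketch of what the tag helpers might look like. It is only an illustration under assumptions: a plain singly linked list stands in for the hlist, calloc() stands in for the per-type SLAB cache, and skb_add_tag(), skb_free_tags() and the tag id values are hypothetical names that do not exist anywhere yet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-ins for the kernel types; everything here is illustrative. */
struct skb_tag;

struct skb_tag_type {
	void (*destructor)(struct skb_tag *);
	const char *name;
};

struct skb_tag {
	struct skb_tag *next;		/* stands in for struct hlist_node */
	struct skb_tag_type *owner;
	int tag_id;
	char data[];			/* subsystem payload follows the header */
};

struct sk_buff {
	struct skb_tag *tag_list;	/* stands in for struct hlist_head */
};

/* Hypothetical tag ids, one per subsystem. */
enum { SKB_TAG_NETFILTER_CONNTRACK = 1, SKB_TAG_SEC_PATH = 2 };

/* Allocate a tag with room for 'size' payload bytes and link it at the head. */
static struct skb_tag *skb_add_tag(struct sk_buff *skb,
				   struct skb_tag_type *owner,
				   int tag_id, size_t size)
{
	struct skb_tag *tag;

	assert(skb != NULL);
	tag = calloc(1, sizeof(*tag) + size);
	if (!tag)
		return NULL;
	tag->owner = owner;
	tag->tag_id = tag_id;
	tag->next = skb->tag_list;
	skb->tag_list = tag;
	return tag;
}

/* Linear walk; returns NULL when the skb carries no tag with this id. */
static struct skb_tag *skb_find_tag(const struct sk_buff *skb, int tag_id)
{
	struct skb_tag *tag;

	for (tag = skb->tag_list; tag; tag = tag->next)
		if (tag->tag_id == tag_id)
			return tag;
	return NULL;
}

/* Run each owner's destructor (if any) and free every tag. */
static void skb_free_tags(struct sk_buff *skb)
{
	struct skb_tag *tag, *next;

	for (tag = skb->tag_list; tag; tag = next) {
		next = tag->next;
		if (tag->owner && tag->owner->destructor)
			tag->owner->destructor(tag);
		free(tag);
	}
	skb->tag_list = NULL;
}
```

The lookup is a linear walk, which should be fine as long as an SKB only ever carries a handful of tags.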
The downsides to this approach are:
1) Tagging an SKB eats a memory allocation, which isn't nice.
This is mainly why I haven't mentioned this idea before.
It may be that, on an active system, the per-cpu SLAB caches
for such tag objects might keep the allocation costs real low.
Another factor is that tags are relatively tiny, so a large
number of them fit in one SLAB.
But on the other hand we've been trying to remove per-packet
kmalloc() counts, see the SKB fast-clone discussions about that.
And people ask for SKB recycling all the time.
2) skb_clone() would get more expensive. This is because you'd
need to clone the SKB tags as well.
There is the possibility to hang the tags off of the
skb_shinfo() area. I know this idea sounds crazy, but
the theory goes that if the netfilter et al. info would
change (and thus, so would the associated tags), then
you'd need to COW the SKB anyways.
This is actually an idea worth considering regardless of
whether we do tags or not. It would result in less reference
counting when we clone an SKB with netfilter stuff or
security stuff attached.
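To illustrate the clone-cost difference described in 2), here is a rough userspace sketch comparing the two layouts. It is a sketch under assumptions, not a proposal for the real code: struct shared_info stands in for the skb_shinfo() area, the 'size' member and both clone helpers are hypothetical, and malloc() stands in for the SLAB allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct skb_tag {
	struct skb_tag *next;
	int tag_id;
	size_t size;			/* payload length, needed to copy the tag */
	char data[];
};

/* Stand-in for the skb_shinfo() area: shared by all clones of one buffer. */
struct shared_info {
	int dataref;			/* number of sk_buff heads sharing this */
	struct skb_tag *tag_list;	/* tags hung off the shared area */
};

struct sk_buff {
	struct skb_tag *tag_list;	/* per-skb tags (first layout) */
	struct shared_info *shinfo;	/* shared tags (second layout) */
};

/* Layout 1: per-skb tags. Cloning costs one allocation per tag. */
static int clone_tags_per_skb(struct sk_buff *dst, const struct sk_buff *src)
{
	struct skb_tag **tail = &dst->tag_list;
	const struct skb_tag *t;
	int allocs = 0;

	*tail = NULL;
	for (t = src->tag_list; t; t = t->next) {
		struct skb_tag *copy = malloc(sizeof(*copy) + t->size);

		if (!copy)
			return -1;	/* caller would unwind the partial list */
		memcpy(copy, t, sizeof(*copy) + t->size);
		copy->next = NULL;
		*tail = copy;
		tail = &copy->next;
		allocs++;
	}
	return allocs;
}

/* Layout 2: tags live in the shared area. Cloning is one refcount bump. */
static void clone_tags_shared(struct sk_buff *dst, const struct sk_buff *src)
{
	dst->shinfo = src->shinfo;
	dst->shinfo->dataref++;
}
```

With the shared layout the clone does no per-tag work at all; the price is that anyone who wants to change a tag on a shared buffer has to COW first.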
Overall I'm not too thrilled with the idea, but I'm enthusiastic
about being convinced otherwise since this would shrink sk_buff
dramatically. :-)