Re: [PATCH]snmp6 64-bit counter support in proc.c

To: kuznet@xxxxxxxxxxxxx
Subject: Re: [PATCH]snmp6 64-bit counter support in proc.c
From: Krishna Kumar <kumarkr@xxxxxxxxxx>
Date: Thu, 22 Jan 2004 13:18:49 -0800
Cc: davem@xxxxxxxxxx (David S. Miller), kuznet@xxxxxxxxxxxxx, mashirle@xxxxxxxxxx, netdev@xxxxxxxxxxx, netdev-bounce@xxxxxxxxxxx, Shirley Ma <xma@xxxxxxxxxx> (Shirley Ma)
Sender: netdev-bounce@xxxxxxxxxxx

Alexei,

That is a good point. We don't want to read the counter while a writer might be
changing it, since an overflow of one word in the middle of the read can produce a
badly corrupt value.
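
To make the failure concrete, here is a small user-space sketch (not kernel code) of a 64-bit counter held as two 32-bit words, with the writer's update split into its two steps so that a read can be placed between them. The names split_counter, writer_add_low, writer_add_carry and reader_snapshot are made up for this illustration.

```c
#include <stdint.h>

/* A 64-bit counter as a 32-bit CPU stores it: two separate words. */
struct split_counter {
	uint32_t lo;
	uint32_t hi;
};

/* Writer step 1: add to the low word, return the carry (0 or 1). */
static uint32_t writer_add_low(struct split_counter *c, uint32_t delta)
{
	uint32_t old = c->lo;

	c->lo = old + delta;
	return c->lo < old;	/* unsigned wrap-around means a carry */
}

/* Writer step 2: propagate the carry into the high word. */
static void writer_add_carry(struct split_counter *c, uint32_t carry)
{
	c->hi += carry;
}

/* Reader: combine whatever the two words hold right now. */
static uint64_t reader_snapshot(const struct split_counter *c)
{
	return ((uint64_t)c->hi << 32) | c->lo;
}
```

Starting from lo = 0xFFFFFFFF, hi = 0 and adding 1, a reader_snapshot() taken between the two writer steps sees the low word already wrapped but the high word not yet updated, so the counter appears to drop by almost 2^32 before it becomes correct again.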

If 64-bit counters are a good idea to implement, what I find OK to do is to penalize the
readers (the proc filesystem interface or netlink) while making sure that the writers don't
get penalized by being forced to serialize. In effect, writers must run as fast as if there
were no readers at all. I could think of a hack to make that happen (hopefully not too ugly :-) :

#if BITS_PER_LONG == 64
/* the old code is OK here. */
#else
__u64 get_sync_data(void *mib[], int cpu, int nr)
{
	__u64 res1, res2;
	__u64 res3;

	res1 = *((__u64 *) (((void *) per_cpu_ptr(mib[0], cpu)) + sizeof (__u64) * nr));
	synchronize_kernel();
	res2 = *((__u64 *) (((void *) per_cpu_ptr(mib[0], cpu)) + sizeof (__u64) * nr));
	if (res2 < res1) {
		/* Overflow, sync and re-read, the next read is guaranteed to be greater */
		synchronize_kernel();
		res2 = *((__u64 *) (((void *) per_cpu_ptr(mib[0], cpu)) + sizeof (__u64) * nr));
	}
	res3 = res2;

	/* similar code for mib[1], add both into res3 */

	return res3;
}
#endif

static __u64
fold_field(void *mib[], int nr)
{
	...
	/* i is the cpu index of the surrounding per-cpu loop */
	res += get_sync_data(mib, i, nr);
	...
}

The value can decrease only once every 4G (2^32) increments, which means that res2, read after
the first synchronize_kernel(), will be less than res1 very rarely (once every few days on a fast
ethernet card). If res2 is less than res1, doing another synchronize_kernel() and re-reading res2
is guaranteed to return a value greater than res1, because another 4G increments couldn't happen
in the time it takes all cpus to context switch once. The synchronize_kernel() is needed so that
we don't read the value faster than the writer is updating the two words.
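
How often the low word wraps depends entirely on the increment rate, so the estimate is easy to check with a few lines of user-space C. This is only a back-of-the-envelope helper; the function name and the rates fed to it are hypothetical, not measurements.

```c
/*
 * Seconds until the low 32-bit word of a counter wraps, for a
 * counter that starts at zero and is incremented at a steady rate.
 * The caller must pass a rate greater than zero.
 */
static double seconds_until_low_word_wrap(double increments_per_second)
{
	const double wrap = 4294967296.0;	/* 2^32 increments */

	return wrap / increments_per_second;
}
```

For example, at a hypothetical 15,000 increments per second this returns roughly 286,000 seconds, a little over three days. A counter that is bumped by more than one per event (a byte counter, say) wraps correspondingly sooner.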

Does that sound realistic for implementing 64-bit counters? Or do you have better or simpler
suggestions?

Thanks,

- KK

          kuznet@xxxxxxxxxxxxx
          Sent by: netdev-bounce@xxxxxxxxxxx

          01/22/2004 10:26 AM



To: Shirley Ma/Beaverton/IBM@IBMUS
cc: davem@xxxxxxxxxx (David S. Miller), mashirle@xxxxxxxxxxxxxxxxxxxxxxx, kuznet@xxxxxxxxxxxxx, netdev@xxxxxxxxxxx
Subject: Re: [PATCH]snmp6 64-bit counter support in proc.c


Hello!

> Did you hear different voices?

Here is a little warning. It will give corrupt values on 32-bit archs
when an update with 32-bit overflow happens while the value is being folded.

To do 64-bit arithmetic you need either to serialize the reader wrt the writer
or to do some funny tricks with detecting overflows while reading and a
special sequence of operations at update with proper barriers, which
will be reflected in performance anyway. Essentially, this haemorrhoid
is the reason why they stayed 32 bit.
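
One well-known shape for such a "special sequence of operations with barriers" is a sequence counter (seqlock-style) around the two words. The mail does not name a specific scheme, so the following is only an assumed illustration, written as user-space C11 rather than kernel code; struct seq_counter64 and both functions are hypothetical names. It assumes a single writer per counter, as with per-cpu data.

```c
#include <stdatomic.h>
#include <stdint.h>

struct seq_counter64 {
	atomic_uint seq;	/* even: stable, odd: update in progress */
	_Atomic uint32_t lo;
	_Atomic uint32_t hi;
};

/* Writer: bump seq to odd, update both words, bump seq back to even. */
static void seq_counter64_add(struct seq_counter64 *c, uint32_t delta)
{
	unsigned int s = atomic_load_explicit(&c->seq, memory_order_relaxed);
	uint32_t lo = atomic_load_explicit(&c->lo, memory_order_relaxed);
	uint32_t hi = atomic_load_explicit(&c->hi, memory_order_relaxed);
	uint32_t new_lo = lo + delta;

	atomic_store_explicit(&c->seq, s + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&c->lo, new_lo, memory_order_relaxed);
	if (new_lo < lo)	/* low word wrapped: carry into high word */
		atomic_store_explicit(&c->hi, hi + 1, memory_order_relaxed);
	atomic_store_explicit(&c->seq, s + 2, memory_order_release);
}

/* Reader: retry until both words were read with no update in between. */
static uint64_t seq_counter64_read(struct seq_counter64 *c)
{
	unsigned int s1, s2;
	uint32_t lo, hi;

	do {
		s1 = atomic_load_explicit(&c->seq, memory_order_acquire);
		lo = atomic_load_explicit(&c->lo, memory_order_relaxed);
		hi = atomic_load_explicit(&c->hi, memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
		s2 = atomic_load_explicit(&c->seq, memory_order_relaxed);
	} while ((s1 & 1) || s1 != s2);

	return ((uint64_t)hi << 32) | lo;
}
```

The reader never blocks the writer, but the writer pays for two extra stores and the barriers on every update, which is the performance cost the paragraph above refers to.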

Alexey
