Re: [NET] 64 bit byte counter for 2.6.3

From: Stephen Hemminger
Date: Wed Feb 18 2004 - 18:03:52 EST


On Wed, 18 Feb 2004 17:45:16 -0500 (EST)
"Richard B. Johnson" <root@xxxxxxxxxxxxxxxxxx> wrote:

> On Wed, 18 Feb 2004, Markus Hästbacka wrote:
>
> > On Wed, 2004-02-18 at 22:32, Richard B. Johnson wrote:
> > > Manipulation of a 'long long' is not atomic in 32 bit architectures.
> > > Please explain how we don't care, if we shouldn't care. Also some
> > > /proc entries might get read incorrectly with existing tools.
> > Please tell me all the tools, and I'll test them. ifconfig and netstat
> > work correctly, at least.
> >
>
> Hum, they will work when they see 2^64 written in ASCII in /proc/net/dev ?
> I doubt that. Have you ever even seen 64-bit ASCII?
>
> The largest unsigned long long will be 18446744073709551615.
> If it's read with 32-bit tools (sscanf), it will read 4294967295.
>
> > And about the caring, is rx/tx bytes so important that they can't use
> > long long? I would care to see more than 4GB, and maybe some error in
> > the counter at some point (Have you _ever_ seen that happen?) than only
> > 4GB.
> >
>
> The 32-bit wrap is a serious problem. It can't be ignored. The writing
> of any multiple-part variable in the kernel can be interrupted at
> any time. Interrupting half-written variables can have dire
> consequences. Let's pretend that gcc knows how to write a long-long
> with minimum overhead..... high word is in edx and the low word is
> in eax. The memory variable is addressed by ebx....
>
> addl %eax, (%ebx) # Sum low word
> adcl %edx, 0x04(%ebx) # Sum CY and high
>
> Now, get interrupted...
>
> addl %eax, (%ebx) # Sum low word
> ->>> interrupt <<<--
>
> Do some code that adds more stuff to
> the variable....
>
> Return to the interrupted code.
>
> adcl %edx, 0x04(%ebx) # Sum CY and high
>
> The memory variable is now wrong. If it's an event counter, maybe
> it doesn't make any difference. That's what needs to be addressed.
> You can't just dismiss it. Any time you write code that knowingly
> produces wrong results, its impact needs to be fully understood.
>
> Changing a bunch of variables in a 32-bit machine to 64-bit ones
> is a major change, regardless of how trivial it may seem.
>
> You need to make a spin-locked thingy for every variable you
> want to manipulate if the result needs to be correct.
>
> > And no, I didn't post this to be merged into the mainline kernel, just
> > to let users know that there may be an alternative to seeing a maximum of 4GB.
> >
> > This has been working for me since, umm.. let's say 2.4.20. All the
> > tools I've needed have worked.
>
> Just wait until your packet count is 4294967296. It will likely read 0.
> When 4294967297, may read 1. That's off by quite a bit.
>
> >
> > Markus
> >
>
> Cheers,
> Dick Johnson
> Penguin : Linux version 2.4.24 on an Intel Pentium III machine (797.90 BogoMips).
> Note 96.31% of all statistics are fiction.
>
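
A minimal userspace sketch of the hazard discussed above, not kernel code.
It models a 64-bit counter as two 32-bit words and samples it between the
low-word and high-word halves of the update, which is what a reader running
at the wrong moment would see. The names split_counter, add_low, add_high,
read_counter and torn_read_demo are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical 64-bit counter held as two 32-bit words, as on a 32-bit CPU. */
struct split_counter {
    uint32_t lo;
    uint32_t hi;
};

/* First half of "counter += n": add to the low word, return the carry out. */
static uint32_t add_low(struct split_counter *c, uint32_t n)
{
    uint32_t old = c->lo;

    c->lo = old + n;        /* wraps modulo 2^32 */
    return c->lo < old;     /* 1 if the addition carried, else 0 */
}

/* Second half: propagate the carry into the high word. */
static void add_high(struct split_counter *c, uint32_t carry)
{
    c->hi += carry;
}

/* What a reader gets if it samples both words right now. */
static uint64_t read_counter(const struct split_counter *c)
{
    return ((uint64_t)c->hi << 32) | c->lo;
}

/*
 * Performs "counter += n" in two steps and returns the value a reader
 * would observe between them. The completed value is stored in *final.
 */
static uint64_t torn_read_demo(uint64_t start, uint32_t n, uint64_t *final)
{
    struct split_counter c = { (uint32_t)start, (uint32_t)(start >> 32) };
    uint32_t carry = add_low(&c, n);
    uint64_t seen = read_counter(&c);   /* the reader runs here */

    add_high(&c, carry);
    *final = read_counter(&c);
    return seen;
}
```

Starting from 0xFFFFFFFF and adding 1, the reader in the middle sees 0,
while the completed value is 0x100000000.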

Do it right:
* use per-cpu long long for both bytes and packet counts
and change each driver ...
* expose both a new 64 bit and legacy 32 bit /proc interface
* no tools use /sys yet, so it can show the full long long values
* have both a get_stats and get_stats64 hook in netdevice so not all drivers
have to be converted at once.
