Re: [Bug #11308] tbench regression on each kernel release from 2.6.22-> 2.6.28

From: Eric Dumazet
Date: Mon Nov 17 2008 - 11:35:32 EST


Ingo Molnar wrote:
* Eric Dumazet <dada1@xxxxxxxxxxxxx> wrote:

It all looks like pure old-fashioned straight overhead in the networking layer to me. Do we still touch the same global cacheline for every localhost packet we process? Anything like that would show up big time.
Yes we do. I find it strange that we don't see dst_release() in your NMI profile.

I posted a patch (commit 5635c10d976716ef47ae441998aeae144c7e7387, "net: make sure struct dst_entry refcount is aligned on 64 bytes", in the net-next-2.6 tree) to properly align the struct dst_entry refcounter, and got a 4% speedup on tbench on my machine.

Ouch, +4% from a one-liner networking change? That's a _huge_ speedup compared to the things we were after in scheduler land. A lot of scheduler folks worked hard to squeeze the last 1-2% out of the scheduler fastpath (which was not trivial at all). The _full_ scheduler accounts for only about 7% of the total system overhead here on a 16-way box...

4% on my machine, but apparently my machine is sooooo special (see oprofile thread),
so maybe its cpus have a hard time playing with a contended cache line.

It definitely needs more testing on other machines.

Maybe you'll discover the patch is bad on your machines, this is why it's in
net-next-2.6
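
For readers who have not looked at the commit, here is a rough userspace sketch of the general idea: keep the write-hot refcount off the cache line that holds the read-mostly fields. It is not the actual patch. struct demo_dst, its fields, the demo_dst_hold()/demo_dst_release() helpers and the 64-byte CACHE_LINE value are all made up for illustration, and it uses C11 alignas where the kernel has its own mechanisms.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdatomic.h>
#include <stddef.h>

#define CACHE_LINE 64	/* assumed line size; the real value is CPU dependent */

/* Hypothetical stand-in for a route cache entry, NOT the real struct dst_entry. */
struct demo_dst {
	/* Read-mostly fields, consulted for every packet. */
	void *dev;
	void *ops;
	unsigned long expires;

	/*
	 * Write-hot field. Starting it on a cache line boundary means that
	 * refcount updates from other CPUs do not invalidate the line that
	 * holds the read-mostly fields above.
	 */
	alignas(CACHE_LINE) atomic_int refcnt;
	unsigned long lastuse;
};

/* Fail the build if someone reorders the struct and breaks the layout. */
_Static_assert(offsetof(struct demo_dst, refcnt) % CACHE_LINE == 0,
	       "refcnt must start on a cache line boundary");

/* Take a reference. */
static inline void demo_dst_hold(struct demo_dst *d)
{
	atomic_fetch_add_explicit(&d->refcnt, 1, memory_order_relaxed);
}

/* Drop a reference; returns the count as it was before the decrement. */
static inline int demo_dst_release(struct demo_dst *d)
{
	return atomic_fetch_sub_explicit(&d->refcnt, 1, memory_order_release);
}
```

The refcount line itself is still contended when every CPU takes and drops references on the same entry; the alignment only stops that traffic from also hurting readers of the neighbouring fields.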


So why should we be handling this as anything but a plain networking performance regression/weakness? The localhost scalability bottleneck has been reported a _long_ time ago.


The struct dst_entry problem was already discovered a _long_ time ago
and was probably solved at that time.

(commit f1dd9c379cac7d5a76259e7dffcd5f8edc697d17
Thu, 13 Mar 2008 05:52:37 +0000 (22:52 -0700)
[NET]: Fix tbench regression in 2.6.25-rc1)

Then, a gremlin came and broke the thing.

There are many contended cache lines in the system; we can do our
best to try to make them disappear. That's not always possible.

Another contended cache line is the rwlock in iptables.
I remember Stephen had a patch to make the thing use RCU.
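
To illustrate why RCU helps here: a reader taking a rwlock has to atomically update the lock word, so every CPU doing lookups writes the same cache line. With an RCU-style scheme, readers only load a pointer. Below is a minimal userspace sketch of that read-side idea using C11 atomics. It is not the kernel RCU API (rcu_read_lock(), rcu_dereference(), rcu_assign_pointer()) and not Stephen's patch; struct demo_rules, demo_lookup() and demo_replace() are hypothetical names, and the grace period needed before freeing an old table is left out entirely.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical rule table, standing in for a much larger structure. */
struct demo_rules {
	int verdict;
};

/* Pointer to the currently published table; starts out as NULL. */
static _Atomic(struct demo_rules *) demo_current;

/*
 * Read side: a single load, no store to shared memory, so concurrent
 * readers do not bounce a cache line between CPUs.
 * The caller must make sure a table has been published first.
 */
static int demo_lookup(void)
{
	struct demo_rules *r =
		atomic_load_explicit(&demo_current, memory_order_acquire);
	return r->verdict;
}

/*
 * Update side: publish a fully initialised table and hand back the old one.
 * A real implementation must wait until no reader can still be using the
 * old table before freeing it; that step is omitted in this sketch.
 */
static struct demo_rules *demo_replace(struct demo_rules *new_rules)
{
	return atomic_exchange_explicit(&demo_current, new_rules,
					memory_order_acq_rel);
}
```

The cost moves to the update side, which is acceptable when rule changes are rare compared to packet lookups.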
