Re: big picture UDP/IP performance question re 2.6.18 -> 2.6.32

From: starlight
Date: Sun Oct 02 2011 - 11:33:09 EST

At 09:21 AM 10/2/2011 +0200, Eric Dumazet wrote:
>On Sunday, October 02, 2011 at 01:33 -0400,
>You might try to disable any fancy power saving
>mode in your machine. Maybe on your machine, cost
>to enter/exit deep sleep state is too high.

I'll check this out. It's an Opteron 6174,
so it's recent and has all the aggressive power
saving AMD can dish out. Note that 'cpuspeed'
is turned off. The only thing I can think of
off the top of my head is to boot with
'nohalt' (which used to be idle=poll). If
anyone knows further Magny-Cours tweaks,
please let me know.

>Just to clarify a bit :
>Sometimes, optimizing one part of the kernel can
>have a negative impact on some workloads because
>we end up doing more sleep/wakeup of consumers :
>Several threads might try to acquire a lock at the
>same time, while previously they got no contention.
>In 2.6.35, commit c377411f2494a (net:
>sk_add_backlog() take rmem_alloc into account)
>changed the backlog limit to avoid taking the
>socket lock on a flood, allowing 200,000 pps to be
>received on a test machine instead of 100 pps. But
>the receiver was doing a plain
>while (1)
> recv(...);
>And maximum throughput was reached because the
>task never called the scheduler...

Sometimes I think it might be nice to run without
an OS at all :-)

We are looking at various kernel-bypass approaches
for receiving packets, and I have it in mind that
one of them may let us run on newer kernels
without giving up performance.

>I see nothing obvious in the profile but userland
>processing, futex calls.

Indeed, this is from the app-internal queuing
I mentioned in the last post. Because UDP is
presently unusable in any .32 or .39 kernel,
I'm stuck using packet sockets as a proxy
for it. As mentioned, the two produce very
similar results in all past tests.

>Network processing seems to account less than 10%
>of total cpu... All this sounds more like a process
>scheduling regression than a network stack one...

Interesting. When I start testing kernel bypass,
the degree to which this is the case should become
apparent.

>On new kernels, you can check if your udp sockets
>drops frames because of rcvbuffer being full (cat
>/proc/net/udp, check last column 'drops')

As I've stated several times, all these tests are
run with verifiable zero data loss. The max-load
test is the highest rate at which zero data loss
can be maintained. UDP socket buffers are set to 64MB.

>To check if softirq processing hit some limits :
>cat /proc/net/softnet_stat


>Please send full "dmesg" output


At 07:47 AM 10/2/2011 -0700, Stephen Hemminger wrote:
>Try disabling PCI DMA remapping. The additional
>overhead of setting up IO mapping can be a
>performance buzz kill. Try CONFIG_DMAR=n

Definitely. Thanks for the tip!


Probably won't get to these until Monday or so.
