Re: big picture UDP/IP performance question re 2.6.18 -> 2.6.32
From: Eric Dumazet
Date: Fri Oct 07 2011 - 01:40:21 EST
On Thursday, 06 October 2011 at 23:27 -0400, starlight@xxxxxxxxxxx wrote:
> After writing the last post, the large
> difference in IRQ rate between the older
> and newer kernels caught my eye.
> I wonder if the hugely lower rate in the older
> kernels reflects more agile shifting
> into and out of NAPI mode by the network
> driver. In this test the sending system
> pulses data out on millisecond boundaries
> due to the behavior of nsleep(), which
> is used to establish the playback pace.
> If the older kernels are switching to NAPI
> for much of the surge and then switching out
> once the pulse falls off, it might
> conceivably result in much better latency
> and overall performance.
> All tests were run with Intel 82571
> network interfaces and the 'e1000e'
> device driver. Some used the driver
> packaged with the kernel, some used
> the Intel driver compiled from the source
> found on sourceforge.net. I could never
> detect any difference between the two.
> Since data in the production environment
> also tends to arrive in bursts, I don't find
> the pulsing playback behavior a detriment.
That's exactly the opposite: your old kernel is not fast enough to
enter/exit NAPI on every incoming frame.
Instead of one IRQ per incoming frame, you get fewer interrupts:
a single NAPI run processes more than one frame.
Now increase your incoming rate, and you'll discover that a newer kernel
is able to process more frames without losses.
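A rough way to check the interrupt-per-frame behaviour described above is to sample the NIC's counters in /proc/interrupts over a fixed window. This is a hypothetical helper, not from the original mail: the `eth0` default and the one-second window are assumptions.

```shell
# count_irqs PATTERN FILE: sum the per-CPU interrupt counts of every
# line matching PATTERN in a /proc/interrupts-style FILE
count_irqs() {
    awk -v pat="$1" '
        $0 ~ pat { for (i = 2; i <= NF; i++) if ($i ~ /^[0-9]+$/) s += $i }
        END { print s + 0 }' "$2"
}

# Sample the live counter twice, one second apart; if the delta is close
# to the incoming frame rate, NAPI is not coalescing frames at all.
if [ -r /proc/interrupts ]; then
    a=$(count_irqs "${1:-eth0}" /proc/interrupts)
    sleep 1
    b=$(count_irqs "${1:-eth0}" /proc/interrupts)
    echo "$((b - a)) interrupts/sec"
fi
```

Comparing this number against the sender's pulse rate shows whether the driver is taking one IRQ per frame or one IRQ per burst.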
About your thread model:
You have one thread that reads the incoming frames and distributes them
to several queues based on some flow parameters, then wakes up a worker
thread for each queue.
This kind of model is very expensive and triggers a lot of false sharing.
New kernels are able to perform this fanout in kernel land.
You really should take a look at Documentation/networking/scaling.txt
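As a concrete illustration of what scaling.txt describes, the in-kernel fanout (RPS/RFS) is configured through sysfs and procfs. A minimal sketch, assuming a single-queue `eth0`, CPUs 0-3 (mask `f`), and a 2.6.35+ kernel with RPS built in; requires root:

```shell
# Spread eth0's rx-0 queue across CPUs 0-3 via Receive Packet Steering
echo f > /sys/class/net/eth0/queues/rx-0/rps_cpus

# Optionally enable Receive Flow Steering, so packets are steered to
# the CPU where the consuming application actually runs
echo 32768 > /proc/sys/net/core/rps_sock_flow_entries
echo 32768 > /sys/class/net/eth0/queues/rx-0/rps_flow_cnt
```

The device name, CPU mask, and table sizes above are assumptions; see scaling.txt for how to pick them for a given machine.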
[ Another way of doing this fanout is to use some iptables rules:
check the following commit changelog for an idea ]
Author: Eric Dumazet <eric.dumazet@xxxxxxxxx>
Date: Fri Jul 23 12:59:36 2010 +0200
netfilter: add xt_cpu match
In some situations a CPU match permits better spreading of
connections, or selecting targets only for a given CPU.
With Receive Packet Steering or a multiqueue NIC and appropriate IRQ
affinities, we can distribute traffic to the available CPUs, per session
(all RX packets for a given flow are handled by a given CPU).
Since some legacy applications are not SMP friendly, one way to scale a
server is to run multiple copies of them.
Instead of randomly choosing an instance, we can use the cpu number as a
key so that the softirq handler for a whole instance runs on a single
cpu, maximizing cache effects in the TCP/UDP stacks.
Using NAT, for example, a four-way machine might run four copies of the
server application, using a separate listening port for each instance,
but still presenting a single external port:
iptables -t nat -A PREROUTING -p tcp --dport 80 -m cpu --cpu 0 \
-j REDIRECT --to-port 8080
iptables -t nat -A PREROUTING -p tcp --dport 80 -m cpu --cpu 1 \
-j REDIRECT --to-port 8081
iptables -t nat -A PREROUTING -p tcp --dport 80 -m cpu --cpu 2 \
-j REDIRECT --to-port 8082
iptables -t nat -A PREROUTING -p tcp --dport 80 -m cpu --cpu 3 \
-j REDIRECT --to-port 8083