Re: APIC seems not working under high network load

From: Eric Dumazet
Date: Tue Jan 03 2012 - 15:30:34 EST


On Tuesday, 03 January 2012 at 14:28 -0500, Jia Rao wrote:
> Hi all,
>
> I had scalability issues when running memcached-like workloads on an
> Intel 8-core system. It turned out that under high network load only
> one core handled the interrupts from the NIC. In the output of mpstat,
> one of the cores was overloaded with 100% %soft time. I tried to echo
> different core ids to /proc/irq/irq#/smp_affinity, but it did not take
> effect until I stopped the incoming network traffic. It seems that the
> APIC is not distributing interrupts to different cores.
>
> Below are the details of the testing system:
> OS: Linux 3.1.4 X86_64
> CPU: Intel L5450, quad-core
> NIC: Intel e1000e

If the network load is high enough, the cpu handling the network IRQ
stays in softirq mode (NAPI) and never re-arms the interrupt, so no other
cpu can handle the load.

This is why it's better to change /proc/irq/irq#/smp_affinity _before_
the load starts.
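
A minimal sketch of doing that from a script, run as root before the
traffic starts. The function name set_irq_affinity, the PROC_IRQ_DIR
variable and the IRQ number 42 are made up for illustration; take the
real IRQ number of your NIC from /proc/interrupts.

```shell
#!/bin/sh
# Sketch: pin one IRQ to a set of cpus before the load starts.
# PROC_IRQ_DIR is a hypothetical override, only there so the function
# can be tried against a scratch directory instead of the real /proc.
PROC_IRQ_DIR=${PROC_IRQ_DIR:-/proc/irq}

set_irq_affinity() {
    irq=$1     # IRQ number of the NIC (see /proc/interrupts)
    mask=$2    # hex cpu bitmask: 1 = cpu0, 2 = cpu1, 4 = cpu2, ...
    file="$PROC_IRQ_DIR/$irq/smp_affinity"
    # Refuse early if the IRQ does not exist or we are not root.
    [ -w "$file" ] || { echo "cannot write $file" >&2; return 1; }
    echo "$mask" > "$file"
}

# Example (42 is a placeholder IRQ number):
#   set_irq_affinity 42 2      # deliver IRQ 42 to cpu1
```

Reading the same file back shows whether the kernel accepted the mask.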

With your kernel, one way to help this cpu exit from softirq is to use
RPS (Receive Packet Steering), since packets will then be spread over
several cpus. Later, hardware IRQs might be spread over other cpus as well.

for n in $(seq 0 7)
do
	echo ff > /sys/class/net/eth0/queues/rx-$n/rps_cpus
done

[ If your eth0 has 8 queues ]
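
If you are not sure how many RX queues the device exposes, a variant
that walks whatever rx-* directories exist may be safer. This is only a
sketch: the function name enable_rps and the SYS_NET_DIR variable are
made up for illustration, and the mask ff assumes 8 cpus as above.

```shell
#!/bin/sh
# Sketch: write an RPS cpu mask to every RX queue a device actually has.
# SYS_NET_DIR is a hypothetical override, only there so the function
# can be tried against a scratch directory instead of the real /sys.
SYS_NET_DIR=${SYS_NET_DIR:-/sys/class/net}

enable_rps() {
    dev=$1     # interface name, e.g. eth0
    mask=$2    # hex cpu bitmask, e.g. ff for cpu0-cpu7
    for f in "$SYS_NET_DIR/$dev"/queues/rx-*/rps_cpus
    do
        # If the glob matched nothing, $f is the literal pattern: skip it.
        [ -e "$f" ] || continue
        echo "$mask" > "$f"
    done
}

# Example (as root):
#   enable_rps eth0 ff
```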


