Re: [PATCH 4/5] netdev: implement infrastructure for threadable napi irq

From: Eric Dumazet
Date: Thu Jun 16 2016 - 12:55:35 EST


>
> I guess you mean 'consumer' here. The scheduler doesn't fail to migrate
> it: the consumer is actually migrated a lot of times, but on each cpu a
> competing and running ksoftirqd thread is found.
>
> The general problem is that under significant network load (not
> necessarily a UDP flood; similar behavior is observed even with TCP_RR
> tests), with enough rx queues available and enough flows running, no
> single thread/process can use 100% of any cpu, even if the overall
> capacity would allow it.
>

Looks like a general process scheduler issue?

Really, allowing the RX processing to be migrated among cpus is
problematic for TCP, as it will increase reordering.

RFS, for example, has very specific logic to avoid these problems as
much as possible:

	/*
	 * If the desired CPU (where last recvmsg was done) is
	 * different from current CPU (one in the rx-queue flow
	 * table entry), switch if one of the following holds:
	 *   - Current CPU is unset (>= nr_cpu_ids).
	 *   - Current CPU is offline.
	 *   - The current CPU's queue tail has advanced beyond the
	 *     last packet that was enqueued using this table entry.
	 *     This guarantees that all previous packets for the flow
	 *     have been dequeued, thus preserving in order delivery.
	 */
	if (unlikely(tcpu != next_cpu) &&
	    (tcpu >= nr_cpu_ids || !cpu_online(tcpu) ||
	     ((int)(per_cpu(softnet_data, tcpu).input_queue_head -
	      rflow->last_qtail)) >= 0)) {
		tcpu = next_cpu;
		rflow = set_rps_cpu(dev, skb, rflow, next_cpu);
	}
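
To illustrate the third condition in isolation, here is a minimal
userspace sketch, not kernel code. The names flow_queue_drained() and
may_switch_cpu() are hypothetical, and the counters are assumed to be
32-bit unsigned values that may wrap. The point is that the signed
difference stays correct across wraparound, where a plain
"head >= last_qtail" comparison would not.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): true when the queue head
 * counter has reached or passed last_qtail, i.e. every packet enqueued
 * through the old table entry has already been dequeued.
 * The unsigned subtraction wraps modulo 2^32; reinterpreting the result
 * as signed gives the right answer as long as the two counters are less
 * than 2^31 apart. The unsigned-to-signed conversion is
 * implementation-defined in ISO C; two's complement (gcc, clang) is
 * assumed here.
 */
static bool flow_queue_drained(uint32_t input_queue_head, uint32_t last_qtail)
{
	return (int32_t)(input_queue_head - last_qtail) >= 0;
}

/*
 * Hypothetical mirror of the quoted condition, with the per-cpu state
 * passed in as plain arguments (assumption: the caller looked it up).
 */
static bool may_switch_cpu(unsigned int tcpu, unsigned int next_cpu,
			   unsigned int nr_cpu_ids, bool tcpu_online,
			   uint32_t input_queue_head, uint32_t last_qtail)
{
	if (tcpu == next_cpu)
		return false;		/* already on the desired CPU */
	if (tcpu >= nr_cpu_ids)
		return true;		/* current CPU is unset */
	if (!tcpu_online)
		return true;		/* current CPU is offline */
	/* Only switch once the old CPU's backlog for this flow is drained. */
	return flow_queue_drained(input_queue_head, last_qtail);
}
```

With head = 2 and last_qtail = 0xFFFFFFFE (the head counter has wrapped),
the difference is 4, so the flow is considered drained and may move.
With the values swapped, the difference is -4 and the flow stays on its
current CPU, so packets cannot be reordered.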