Re: [PATCH net-next 2/2] net: exit busy loop when another process is runnable

From: Eliezer Tamir
Date: Tue Sep 02 2014 - 02:04:43 EST

On 02/09/2014 06:35, Jason Wang wrote:
> On 09/01/2014 02:55 PM, Eliezer Tamir wrote:
>> On 26/08/2014 10:16, Jason Wang wrote:
>>> On 08/25/2014 09:16 PM, Eliezer Tamir wrote:

>> Think about the case where two processes are busy polling on the
>> same CPU and the same device queue. Since busy polling processes
>> incoming packets on the queue from any process, this scenario works
>> well currently,
> I see, but looks like we can simply do this by exiting the busy loop
> when ndo_busy_poll() finds something but not for current socket?

I don't think there is a need for that.

When ndo_busy_poll() finds something it feeds it to the stack, which
will process the packet, just as if it came from NAPI polling.
So, if this is data that someone is blocked waiting on, the stack will
wake them up, and then you presumably can decide which app should get
the CPU.
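
As a rough user-space illustration of that point (a toy model, not kernel
code): the poller below drains one shared queue on behalf of a single
caller, yet every packet is delivered to whichever socket it belongs to,
and any other socket that received data is flagged for wakeup. All names
here (toy_sock, toy_queue, toy_busy_poll and so on) are hypothetical and
exist only for this sketch; the real path goes through ndo_busy_poll()
and the normal receive stack.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_SOCKS 4
#define QUEUE_LEN 8

/* Hypothetical toy types: one shared device queue and a few sockets. */
struct toy_sock {
	int  rx_packets;	/* packets delivered to this socket */
	bool woken;		/* set when a blocked waiter should be woken */
};

struct toy_queue {
	int    owner[QUEUE_LEN];	/* destination socket index per packet */
	size_t head, tail;		/* consume at head, produce at tail */
};

/* Queue a packet for socket 'dst'; fails on a bad index or a full queue. */
static bool toy_enqueue(struct toy_queue *q, int dst)
{
	if (dst < 0 || dst >= MAX_SOCKS)
		return false;
	if (q->tail - q->head == QUEUE_LEN)
		return false;
	q->owner[q->tail % QUEUE_LEN] = dst;
	q->tail++;
	return true;
}

/*
 * Drain the queue on behalf of 'caller'. Every packet goes to its own
 * socket no matter who is polling; sockets other than the caller that
 * got data are marked as woken. Returns the caller's own packet count.
 */
static int toy_busy_poll(struct toy_queue *q, struct toy_sock *socks,
			 int caller)
{
	int mine = 0;

	while (q->head != q->tail) {
		int dst = q->owner[q->head % QUEUE_LEN];

		q->head++;
		socks[dst].rx_packets++;
		if (dst == caller)
			mine++;
		else
			socks[dst].woken = true;
	}
	return mine;
}
```

In this model the decision about who runs next is left entirely to
whatever consumes the woken flag, which stands in for the scheduler.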

Note that there is no easy way to know, by looking at the
incoming traffic, whether it is important, or even whether you are
seeing a full message. (Maybe you only have 9 packets out of 10?)
The only place this knowledge might exist is in the layers of the
stack closer to the user.

>> and will not work at all when polling yields to other
>> processes that are of the same priority that are running on the same
>> CPU.
>> Maybe the networking subsystem should maintain a list of device
>> queues that need busypolling and have a thread that would poll
>> all of them when there's nothing better to do.
> Not sure whether this method will scale considering thousands of sockets
> and processes.

There may be millions of sockets, but in most cases only a handful of
device queues per CPU to busy poll on. I have tested the epoll rfc
code with hundreds of thousands of sockets and one or two device
queues and it scales pretty well.
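
A small user-space sketch of why the socket count matters less than the
queue count: if each socket is tagged with the device queue it receives
on, the set of things to busy poll is the set of distinct queue ids, not
the set of sockets. The function name, the TOY_MAX_QUEUES limit and the
flat array mapping are assumptions made for illustration only; this is
not how the epoll RFC code tracks the relationship.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_QUEUES 64	/* arbitrary limit for this sketch */

/*
 * Count the distinct device queues referenced by n sockets, where
 * sock_queue[i] is the queue id socket i receives on. Ids at or above
 * TOY_MAX_QUEUES are ignored. The result is the number of queues a
 * poller would have to visit, independent of how many sockets exist.
 */
static size_t toy_count_poll_targets(const unsigned *sock_queue, size_t n)
{
	bool seen[TOY_MAX_QUEUES] = { false };
	size_t distinct = 0;

	for (size_t i = 0; i < n; i++) {
		unsigned qid = sock_queue[i];

		if (qid >= TOY_MAX_QUEUES || seen[qid])
			continue;
		seen[qid] = true;
		distinct++;
	}
	return distinct;
}
```

With a hundred thousand sockets spread over two queues, the function
reports two poll targets; the per-iteration polling work follows that
number rather than the socket count.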

The part I don't like in that code is the cumbersome mechanism I used
to track the socket -> queue relationship. I think that if I had more
time to work on it, I would instead look into extending the epoll
interface so that libevent can tell the kernel what it wants, instead
of having the busypoll code try and learn it.
