From: "Michael S. Tsirkin" <mst@xxxxxxxxxx>
Date: Wed, 1 Jun 2016 15:54:34 +0300
This is in response to the proposal by Jason to make the tun
rx packet queue lockless using a circular buffer.
My testing seems to show that, at least for the common use case
in networking, which isn't lockless, a circular buffer
with indices does not perform that well: each index access
causes a cache line to bounce between CPUs, and the index
access also causes stalls due to the dependency on its value.
By comparison, an array of pointers, where NULL means invalid
and !NULL means valid, can be updated without messing with barriers
at all and does not have this issue.
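To make the comparison concrete, here is a minimal user-space sketch of
the pointer-array idea, assuming a single producer and a single consumer.
It is not the code from this series: the names (ptr_array_ring,
ring_produce, ring_consume, RING_SIZE) are hypothetical, and the C11
acquire/release orderings are a conservative choice for portable
user-space code rather than something the text above calls for.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define RING_SIZE 8 /* hypothetical; the size must be known in advance */

struct ptr_array_ring {
	_Atomic(void *) slot[RING_SIZE]; /* NULL = empty, !NULL = valid */
	size_t head; /* next slot to fill, touched only by the producer */
	size_t tail; /* next slot to drain, touched only by the consumer */
	/* A real implementation would keep head and tail on separate
	 * cache lines so the two sides do not bounce a shared line. */
};

static void ring_init(struct ptr_array_ring *r)
{
	for (size_t i = 0; i < RING_SIZE; i++)
		atomic_init(&r->slot[i], NULL);
	r->head = 0;
	r->tail = 0;
}

/* Returns 0 on success, -1 if the ring is full. */
static int ring_produce(struct ptr_array_ring *r, void *ptr)
{
	assert(ptr != NULL); /* NULL is reserved to mean "empty" */

	/* The slot itself says whether it is free; the consumer's
	 * index is never read here. */
	if (atomic_load_explicit(&r->slot[r->head],
				 memory_order_acquire) != NULL)
		return -1;

	atomic_store_explicit(&r->slot[r->head], ptr, memory_order_release);
	r->head = (r->head + 1) % RING_SIZE;
	return 0;
}

/* Returns the oldest entry, or NULL if the ring is empty. */
static void *ring_consume(struct ptr_array_ring *r)
{
	void *ptr = atomic_load_explicit(&r->slot[r->tail],
					 memory_order_acquire);

	if (ptr == NULL)
		return NULL;

	/* Writing NULL back hands the slot to the producer again. */
	atomic_store_explicit(&r->slot[r->tail], NULL, memory_order_release);
	r->tail = (r->tail + 1) % RING_SIZE;
	return ptr;
}
```

The point of the layout is that each side only reads and writes its own
index plus the slot it is working on, so full/empty is decided from the
slot contents rather than by comparing two shared indices.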
On the flip side, large queues may cause cache pressure.
tun has a queue of 1000 entries by default, and with 8-byte
pointers that's 8K.
At this point I'm not sure this can be solved efficiently.
The correct solution might be sizing the queues appropriately.
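As a rough illustration of the footprint arithmetic, the helpers below
compute the size of the pointer array and the number of cache lines it
spans. The helper names are hypothetical, and the 64-byte cache line is
an assumption (typical for x86-64), not a figure from the text.

```c
#include <stddef.h>

#define QUEUE_ENTRIES    1000 /* tun default queue length, per the text */
#define CACHE_LINE_BYTES 64   /* assumption: typical x86-64 line size */

/* Bytes occupied by an array of 'entries' pointers. */
static size_t queue_bytes(size_t entries)
{
	return entries * sizeof(void *);
}

/* Cache lines spanned by that array, rounded up. */
static size_t queue_cache_lines(size_t entries)
{
	return (queue_bytes(entries) + CACHE_LINE_BYTES - 1) /
	       CACHE_LINE_BYTES;
}
```

With 8-byte pointers, queue_bytes(QUEUE_ENTRIES) is 8000 bytes, which
under the 64-byte assumption is 125 cache lines.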
Here's an implementation of this idea: it can be used more
or less whenever sk_buff_head can be used, except you need
to know the queue size in advance.
I have no fundamental issues with this piece of infrastructure, but when
it gets included I want this series to include at least one use case.
This can be an adaptation of Jason's tun rx packet queue changes, or
similar.
Thanks.