Re: [PATCH net-next] tuntap: introduce tx skb ring

From: Jason Wang
Date: Wed May 18 2016 - 06:23:58 EST




On 2016å05æ18æ 16:13, Jesper Dangaard Brouer wrote:
On Mon, 16 May 2016 15:51:48 +0800
Jason Wang <jasowang@xxxxxxxxxx> wrote:

On 2016å05æ16æ 11:56, Eric Dumazet wrote:
On Mon, 2016-05-16 at 09:17 +0800, Jason Wang wrote:
We used to queue tx packets in sk_receive_queue, this is less
efficient since it requires spinlocks to synchronize between producer
and consumer.
...
struct tun_struct *detached;
+ /* reader lock */
+ spinlock_t rlock;
+ unsigned long tail;
+ struct tun_desc tx_descs[TUN_RING_SIZE];
+ /* writer lock */
+ spinlock_t wlock;
+ unsigned long head;
};
Ok, we had these kind of ideas floating around for many other cases,
like qdisc, UDP or af_packet sockets...

I believe we should have a common set of helpers, not hidden in
drivers/net/tun.c but in net/core/skb_ring.c or something, with more
flexibility (like the number of slots)
Yes, this sounds good.
I agree. It is sad to see everybody is implementing the same thing,
open coding an array/circular based ring buffer. This kind of code is
hard to maintain and get right with barriers etc. We can achieve the
same performance with a generic implementation, by inlining the help
function calls.

I implemented an array based Lock-Free/cmpxchg based queue, that you
could be inspired by, see:
https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/include/linux/alf_queue.h

This looks really interesting, thanks.


The main idea behind my implementation is bulking, to amortize the
locked cmpxchg operation. You might not need it now, but I expect we
need it in the future.

Right, we need change APIs which can read or write multiple buffers at one time for tun (and for others). I agree this will be a good optimization in the future.


You cannot use my alf_queue directly as your "struct tun_desc" is
larger than one-pointer (which the alf_queue works with). But it
should be possible to extend to handle larger "objects".

Yes, and for more generic usage, maybe one more void * is sufficient.



Maybe Steven Rostedt have an even better ring queue implementation
already avail in the kernel?


You mean ring buffer in tracing? Not sure, but it looks rather complex at first glance.