Re: [PATCH 1/3] tuntap: rx batching

From: Jason Wang
Date: Thu Nov 10 2016 - 23:29:02 EST

Next message: Lu Baolu: "Re: [PATCH v4 1/4] usb: dbc: early driver for xhci debug capability"
Previous message: Nayna: "Re: [tpmdd-devel] [PATCH RFC 2/2] tpm: refactor tpm2_get_tpm_pt to tpm2_getcap_cmd"
In reply to: John Fastabend: "Re: [PATCH 1/3] tuntap: rx batching"
Next in thread: John Fastabend: "Re: [PATCH 1/3] tuntap: rx batching"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On 2016å11æ11æ 12:17, John Fastabend wrote:

On 16-11-10 07:31 PM, Michael S. Tsirkin wrote:

>On Fri, Nov 11, 2016 at 10:07:44AM +0800, Jason Wang wrote:

>>
>>
>>On 2016å11æ10æ 00:38, Michael S. Tsirkin wrote:

>>>On Wed, Nov 09, 2016 at 03:38:31PM +0800, Jason Wang wrote:

>>>>Backlog were used for tuntap rx, but it can only process 1 packet at
>>>>one time since it was scheduled during sendmsg() synchronously in
>>>>process context. This lead bad cache utilization so this patch tries
>>>>to do some batching before call rx NAPI. This is done through:
>>>>
>>>>- accept MSG_MORE as a hint from sendmsg() caller, if it was set,
>>>> batch the packet temporarily in a linked list and submit them all
>>>> once MSG_MORE were cleared.
>>>>- implement a tuntap specific NAPI handler for processing this kind of
>>>> possible batching. (This could be done by extending backlog to
>>>> support skb like, but using a tun specific one looks cleaner and
>>>> easier for future extension).
>>>>
>>>>Signed-off-by: Jason Wang<jasowang@xxxxxxxxxx>

>>>So why do we need an extra queue?

>>
>>The idea was borrowed from backlog to allow some kind of bulking and avoid
>>spinlock on each dequeuing.
>>

>>> This is not what hardware devices do.
>>>How about adding the packet to queue unconditionally, deferring
>>>signalling until we get sendmsg without MSG_MORE?

>>
>>Then you need touch spinlock when dequeuing each packet.

>

Random thought, I have a cmpxchg ring I am using for the qdisc work that
could possibly replace the spinlock implementation. I haven't figured
out the resizing API yet because I did not need it but I assume it could
help here and let you dequeue multiple skbs in one operation.

I can post the latest version if useful or an older version is
somewhere on patchworks as well.

.John

Look useful here, and I can compare the performance if you post.

A question is can we extend the skb_array to support that?

Thanks

Next message: Lu Baolu: "Re: [PATCH v4 1/4] usb: dbc: early driver for xhci debug capability"
Previous message: Nayna: "Re: [tpmdd-devel] [PATCH RFC 2/2] tpm: refactor tpm2_get_tpm_pt to tpm2_getcap_cmd"
In reply to: John Fastabend: "Re: [PATCH 1/3] tuntap: rx batching"
Next in thread: John Fastabend: "Re: [PATCH 1/3] tuntap: rx batching"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]