Re: [PATCH 0/3] xsk: fix for xsk_poll writeable

From: Magnus Karlsson
Date: Mon Nov 23 2020 - 09:01:05 EST


On Wed, Nov 18, 2020 at 9:25 AM Xuan Zhuo <xuanzhuo@xxxxxxxxxxxxxxxxx> wrote:
>
> I tried to combine cq available and tx writeable, but I found it very difficult.
> Sometimes we pay attention to the status of "available" for both, but sometimes,
> we may only pay attention to one, such as tx writeable, because we can use the
> item of fq to write to tx. And this kind of demand may be constantly changing,
> and it may be necessary to set it every time before entering xsk_poll, so
> setsockopt is not very convenient. I feel even more that using a new event may
> be a better solution, such as EPOLLPRI, I think it can be used here, after all,
> xsk should not have OOB data ^_^.
>
> However, two other problems were discovered during the test:
>
> * The mask returned by datagram_poll always contains EPOLLOUT
> * It is not particularly reasonable to return EPOLLOUT based on tx not full
>
> After fixing these two problems, I found that when the process is awakened by
> EPOLLOUT, the process can always get the item from cq.
>
> Because the number of packets that the network card can send at a time is
> actually limited, suppose this value is "nic_num". Once the number of
> consumed items in the tx queue is greater than nic_num, this means that there
> must also be new recycled items in the cq queue from nic.
>
> In this way, as long as the tx configured by the user is larger, we won't have
> the situation that tx is already in the writeable state but cannot get the item
> from cq.

I think the overall approach of tying this into poll() instead of
setsockopt() is the right way to go. But we need a more robust
solution. Your patch #3 also breaks backwards compatibility and that
is not allowed. Could you please post some simple code example of what
it is you would like to do in user space? So you would like to wake up
when there are entries in the cq that can be retrieved and the reason
you would like to do this is that you then know you can put some more
entries into the Tx ring and they will get sent as there now are free
slots in the cq. Correct me if wrong. Would an event that wakes you up
when there is both space in the Tx ring and space in the cq work? Is
there a case in which we would like to be woken up when only the Tx
ring is non-full? Maybe there are as it might be beneficial to fill
the Tx and while doing that some entries in the cq has been completed
and away the packets go. But it would be great if you could post some
simple example code, does not need to compile or anything. Can be
pseudo code.

It would also be good to know if your goal is max throughput, max
burst size, or something else.

Thanks: Magnus


> Xuan Zhuo (3):
> xsk: replace datagram_poll by sock_poll_wait
> xsk: change the tx writeable condition
> xsk: set tx/rx the min entries
>
> include/uapi/linux/if_xdp.h | 2 ++
> net/xdp/xsk.c | 26 ++++++++++++++++++++++----
> net/xdp/xsk_queue.h | 6 ++++++
> 3 files changed, 30 insertions(+), 4 deletions(-)
>
> --
> 1.8.3.1
>