Re: [PATCH] net: tuntap: add ioctl() TUNGETQUEUEINDX to fetch queue index

From: Randy Li
Date: Thu Aug 01 2024 - 09:36:52 EST



On 2024/8/1 21:04, Willem de Bruijn wrote:
Randy Li wrote:
On 2024/8/1 05:57, Willem de Bruijn wrote:
nits:

- INDX->INDEX. It's correct in the code
- prefix networking patches with the target tree: PATCH net-next
I see.
Randy Li wrote:
On 2024/7/31 22:12, Willem de Bruijn wrote:
Randy Li wrote:
We need the queue index in the qdisc mapping rule. There is no way to
fetch it.
In which command exactly?
That is for sch_multiq; here is an example:

tc qdisc add dev tun0 root handle 1: multiq

tc filter add dev tun0 parent 1: protocol ip prio 1 u32 match ip dst
172.16.10.1 action skbedit queue_mapping 0
tc filter add dev tun0 parent 1: protocol ip prio 1 u32 match ip dst
172.16.10.20 action skbedit queue_mapping 1

tc filter add dev tun0 parent 1: protocol ip prio 1 u32 match ip dst
172.16.10.10 action skbedit queue_mapping 2
If using an IFF_MULTI_QUEUE tun device, packets are automatically
load balanced across the multiple queues, in tun_select_queue.

If you want more explicit queue selection than by rxhash, tun
supports TUNSETSTEERINGEBPF.
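For reference, a steering program set via TUNSETSTEERINGEBPF just returns the desired queue index for each packet. A minimal sketch (untested; assumes an IFF_TUN device with no Ethernet header, so packets start at the IP header, and libbpf-style annotations; the addresses mirror the tc example above):

#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>

SEC("socket")
int tun_steering(struct __sk_buff *skb)
{
        __u32 daddr;

        /* IPv4 destination address sits at offset 16 of the IP header */
        if (bpf_skb_load_bytes(skb, 16, &daddr, sizeof(daddr)) < 0)
                return 0;

        if (daddr == bpf_htonl(0xac100a14)) /* 172.16.10.20 -> queue 1 */
                return 1;
        if (daddr == bpf_htonl(0xac100a0a)) /* 172.16.10.10 -> queue 2 */
                return 2;
        return 0;                           /* everything else -> queue 0 */
}

char _license[] SEC("license") = "GPL";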
I know about this eBPF feature. But I am a newbie to eBPF and I haven't figured
out how to configure eBPF dynamically.
Lack of experience with an existing interface is insufficient reason
to introduce another interface, of course.

tc(8) is an old interface, but it doesn't have sufficient info here to complete its work.

I don't think eBPF works on all platforms? JIT doesn't sound like a good solution for embedded platforms.

Another problem here is that some VPS providers don't offer a kernel new enough to support eBPF; it is far easier to just patch an old kernel with this.

Anyway, I will look into it, but I will still send out a v2 of this patch. I will figure out whether eBPF can solve all the problems here.

Besides, I think I still need to know which queue is the target in eBPF.
See SKF_AD_QUEUE for classic BPF and __sk_buff queue_mapping for eBPF.
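(For illustration, the classic BPF ancillary load looks like this; a sketch that accepts a packet only if it is already mapped to queue 1:)

#include <linux/filter.h>

struct sock_filter code[] = {
        /* A = skb->queue_mapping, via the ancillary data offset */
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS, SKF_AD_OFF + SKF_AD_QUEUE),
        /* keep the packet only if it is mapped to queue 1 */
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, 1, 0, 1),
        BPF_STMT(BPF_RET | BPF_K, 0xFFFF),  /* accept */
        BPF_STMT(BPF_RET | BPF_K, 0),       /* drop */
};
struct sock_fprog prog = {
        .len = sizeof(code) / sizeof(code[0]),
        .filter = code,
};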
I will look into it. I wish I didn't need the patch that keeps the queue index unchanged.
The purpose here is to take advantage of multiple threads. On the
server side (the gateway of the tunnel's subnet), a different peer
usually involves a different encryption/decryption key pair, so it would
be better to handle each in its own thread. Otherwise the application
would need to implement a dispatcher here.
A thread in which context? Or do you mean queue?
A thread in userspace. Each thread is responsible for one queue.
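As an illustration of that model, each worker thread can open its own queue on the same multi-queue device. A sketch (open_tun_queue is a hypothetical helper, error handling trimmed):

#include <fcntl.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/if.h>
#include <linux/if_tun.h>

/* open one queue of a multi-queue tun device; call once per worker thread */
static int open_tun_queue(const char *name)
{
        struct ifreq ifr;
        int fd = open("/dev/net/tun", O_RDWR);

        if (fd < 0)
                return -1;
        memset(&ifr, 0, sizeof(ifr));
        ifr.ifr_flags = IFF_TUN | IFF_NO_PI | IFF_MULTI_QUEUE;
        strncpy(ifr.ifr_name, name, IFNAMSIZ - 1);
        if (ioctl(fd, TUNSETIFF, &ifr) < 0) {
                close(fd);
                return -1;
        }
        return fd; /* one fd per queue; each thread services its own fd */
}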
I am a newbie to tc(8). I verified the commands above with a tun-based
multi-threaded demo. But I don't know how to drop the unwanted ingress
traffic with the filter here; queue 0 may be a little broken.
Not opposed to exposing the queue index if there is a need. Not sure
yet that there is.

Also, since for an IFF_MULTI_QUEUE device the queue_id is just assigned
iteratively, it can also be inferred without an explicit call.
I don't think there is any ordering guarantee when creating multiple
queues, unless the application uses an explicit lock itself.

That does create a problem when a queue is disabled, though: the kernel
swaps the last queue's index into the disabled slot, which forces the
application to fetch the queue index again and to update the qdisc flow
rules as well.

Could I submit a ***new*** PATCH which would leave a hole in the queue
indices instead? The hole would also be reused when re-enabling the queue.

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 1d06c560c5e6..5473a0fca2e1 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -3115,6 +3115,10 @@ static long __tun_chr_ioctl(struct file *file, unsigned int cmd,
          if (!ns_capable(net->user_ns, CAP_NET_ADMIN))
              return -EPERM;
          return open_related_ns(&net->ns, get_net_ns);
+    } else if (cmd == TUNGETQUEUEINDEX) {
+        if (tfile->detached)
+            return -EINVAL;
+        return put_user(tfile->queue_index, (unsigned int __user *)argp);
Unless you're certain that these fields can be read without RTNL, move
below rtnl_lock() statement.
Will fix in v2.
I was trying not to hold the global lock for a long period; that is why I
didn't send a v2 yesterday.

When I wrote this, I saw ioctl() TUNSETQUEUE->tun_attach() above. Is
the rtnl_lock() scope the lightest lock we can use here?
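For completeness, the userspace side of the proposed ioctl would be just this (a sketch: TUNGETQUEUEINDEX is the constant this patch adds, not in any released uapi header yet, and queue_fd is a queue fd opened as above):

unsigned int idx;

if (ioctl(queue_fd, TUNGETQUEUEINDEX, &idx) == 0)
        printf("fd %d backs queue %u\n", queue_fd, idx);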