Re: [RFC PATCH 5/7] tun: Introduce virtio-net hashing feature

From: Akihiko Odaki
Date: Mon Oct 09 2023 - 06:11:41 EST


On 2023/10/09 19:07, Willem de Bruijn wrote:
On Mon, Oct 9, 2023 at 3:05 AM Akihiko Odaki <akihiko.odaki@xxxxxxxxxx> wrote:



On 2023/10/09 18:54, Willem de Bruijn wrote:
On Mon, Oct 9, 2023 at 3:44 AM Akihiko Odaki <akihiko.odaki@xxxxxxxxxx> wrote:

On 2023/10/09 17:13, Willem de Bruijn wrote:
On Sun, Oct 8, 2023 at 12:22 AM Akihiko Odaki <akihiko.odaki@xxxxxxxxxx> wrote:

virtio-net have two usage of hashes: one is RSS and another is hash
reporting. Conventionally the hash calculation was done by the VMM.
However, computing the hash after the queue was chosen defeats the
purpose of RSS.

Another approach is to use eBPF steering program. This approach has
another downside: it cannot report the calculated hash due to the
restrictive nature of eBPF.

Introduce the code to compute hashes to the kernel in order to overcome
thse challenges. An alternative solution is to extend the eBPF steering
program so that it will be able to report to the userspace, but it makes
little sense to allow to implement different hashing algorithms with
eBPF since the hash value reported by virtio-net is strictly defined by
the specification.

The hash value already stored in sk_buff is not used and computed
independently since it may have been computed in a way not conformant
with the specification.

Signed-off-by: Akihiko Odaki <akihiko.odaki@xxxxxxxxxx>

@@ -2116,31 +2172,49 @@ static ssize_t tun_put_user(struct tun_struct *tun,
}

if (vnet_hdr_sz) {
- struct virtio_net_hdr gso;
+ union {
+ struct virtio_net_hdr hdr;
+ struct virtio_net_hdr_v1_hash v1_hash_hdr;
+ } hdr;
+ int ret;

if (iov_iter_count(iter) < vnet_hdr_sz)
return -EINVAL;

- if (virtio_net_hdr_from_skb(skb, &gso,
- tun_is_little_endian(tun), true,
- vlan_hlen)) {
+ if ((READ_ONCE(tun->vnet_hash.flags) & TUN_VNET_HASH_REPORT) &&
+ vnet_hdr_sz >= sizeof(hdr.v1_hash_hdr) &&
+ skb->tun_vnet_hash) {

Isn't vnet_hdr_sz guaranteed to be >= hdr.v1_hash_hdr, by virtue of
the set hash ioctl failing otherwise?

Such checks should be limited to control path where possible

There is a potential race since tun->vnet_hash.flags and vnet_hdr_sz are
not read at once.

It should not be possible to downgrade the hdr_sz once v1 is selected.

I see nothing that prevents shrinking the header size.

tun->vnet_hash.flags is read after vnet_hdr_sz so the race can happen
even for the case the header size grows though this can be fixed by
reordering the two reads.

One option is to fail any control path that tries to re-negotiate
header size once this hash option is enabled?

There is no practical reason to allow feature re-negotiation at any
arbitrary time.

I think it's a bit awkward interface design since tun allows to reconfigure any of its parameters, but it's certainly possible.