Re: [ovs-dev] [PATCH net-next v8] net: openvswitch: IPv6: Add IPv6 extension header support

From: Roi Dayan
Date: Tue Mar 08 2022 - 09:39:16 EST




On 2022-03-08 4:12 PM, Ilya Maximets wrote:
On 3/8/22 09:21, Johannes Berg wrote:
On Mon, 2022-03-07 at 21:45 -0800, Jakub Kicinski wrote:

Let me add some people I associate with genetlink work in my head
(fairly or not) to keep me fair here.

:)

It's highly unacceptable for user space to straight up rewrite kernel
uAPI types


Agree.

I 100% agree with that and will work on the userspace part to make sure
we're not adding anything to the kernel uAPI types.

FWIW, a quick grep over the userspace code shows a similar problem with a few
other types, but they are less severe, because they are provided as part
of OVS actions and the kernel doesn't send anything that wasn't previously
set by userspace in that case. There still might be a problem during a
downgrade of userspace while the kernel configuration remains intact,
but that is not a common scenario. Will work on fixing that in userspace.
No need to change the kernel uAPI for these, IMO.


Since it's rc7, we end up with the kernel and OVS broken with each other.
Can we revert the kernel patches anyway and introduce them again later,
when OVS userspace is also updated?


but if it already happened the only fix is something like:

diff --git a/include/uapi/linux/openvswitch.h b/include/uapi/linux/openvswitch.h
index 9d1710f20505..ab6755621e02 100644
--- a/include/uapi/linux/openvswitch.h
+++ b/include/uapi/linux/openvswitch.h
@@ -351,11 +351,16 @@ enum ovs_key_attr {
 	OVS_KEY_ATTR_CT_ORIG_TUPLE_IPV4, /* struct ovs_key_ct_tuple_ipv4 */
 	OVS_KEY_ATTR_CT_ORIG_TUPLE_IPV6, /* struct ovs_key_ct_tuple_ipv6 */
 	OVS_KEY_ATTR_NSH, /* Nested set of ovs_nsh_key_* */
-	OVS_KEY_ATTR_IPV6_EXTHDRS, /* struct ovs_key_ipv6_exthdr */
 #ifdef __KERNEL__
 	OVS_KEY_ATTR_TUNNEL_INFO, /* struct ip_tunnel_info */
 #endif
+	/* User space decided to squat on types 30 and 31 */
+	OVS_KEY_ATTR_IPV6_EXTHDRS = 32, /* struct ovs_key_ipv6_exthdr */
+	/* WARNING: <scary warning to avoid the problem coming back> */

Yes, that is something that I had in mind too. The only thing that makes
me uncomfortable is OVS_KEY_ATTR_TUNNEL_INFO = 30 here. Even though it
doesn't make a lot of difference, I'd rather keep the kernel-only attributes
at the end of the enumeration. Is there a better way to handle kernel-only
attributes?

Also, the OVS_KEY_ATTR_ND_EXTENSIONS (31) attribute used to store IPv6 Neighbor
Discovery extensions is currently implemented only for userspace, but nothing
actually prevents us from having a kernel implementation. So, we need a way to
make it usable by the kernel in the future.
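
To make the numbering problem concrete, here is a minimal sketch in C. It is a
toy model only: every TOY_* name and every value below is hypothetical and much
smaller than the real enum ovs_key_attr. It shows how implicit enum numbering
lets a newly inserted kernel attribute land on a number userspace already uses,
and how an explicit value skips the squatted range while leaving the
kernel-only attribute in the middle of the enumeration.

```c
/* Toy model: all names and values are hypothetical, not the real uAPI. */

/* Kernel header before the new attribute is added. */
enum toy_attr_old {
	TOY_OLD_UNSPEC,		/* 0 */
	TOY_OLD_NSH,		/* 1: last attribute shared with userspace */
	TOY_OLD_TUNNEL_INFO,	/* 2: kernel-only, never sent over netlink */
};

/* Userspace privately continues the numbering after the last shared type. */
enum toy_attr_user {
	TOY_USER_ATTR_A = TOY_OLD_NSH + 1,	/* 2 */
	TOY_USER_ND_EXTENSIONS,			/* 3 */
};

/* Kernel adds a new attribute right after NSH: implicit numbering collides. */
enum toy_attr_broken {
	TOY_BROKEN_UNSPEC,		/* 0 */
	TOY_BROKEN_NSH,			/* 1 */
	TOY_BROKEN_IPV6_EXTHDRS,	/* 2: same number as TOY_USER_ATTR_A */
	TOY_BROKEN_TUNNEL_INFO,		/* 3 */
};

/* Proposed style of fix: give the new attribute an explicit, unused number. */
enum toy_attr_fixed {
	TOY_FIXED_UNSPEC,		/* 0 */
	TOY_FIXED_NSH,			/* 1 */
	TOY_FIXED_TUNNEL_INFO,		/* 2: kernel-only, shares a squatted number */
	/* 2 and 3 are squatted by userspace in this toy model */
	TOY_FIXED_IPV6_EXTHDRS = 4,	/* first number nobody uses */
};

/* Returns 1 if a type number is one of the userspace-private ones. */
int toy_is_userspace_private(int type)
{
	return type == TOY_USER_ATTR_A || type == TOY_USER_ND_EXTENSIONS;
}
```

In this model toy_is_userspace_private(TOY_BROKEN_IPV6_EXTHDRS) is true, while
the explicitly numbered TOY_FIXED_IPV6_EXTHDRS is not. The kernel-only
TOY_FIXED_TUNNEL_INFO still shares a number with a userspace-private type,
which is the same discomfort as with OVS_KEY_ATTR_TUNNEL_INFO above.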


It might be nicer to actually document here in what's at least supposed
to be the canonical documentation of the API what those types were used
for.

I agree with that.

Note that with strict validation at least they're rejected by the
kernel, but of course I have no idea what kind of contortions userspace
does to make it even think about defining its own types (netlink
normally sits at the kernel/userspace boundary, so where does it make
sense for userspace to have its own types?)

(Though note that technically netlink supports userspace<->userspace
communication, but that's not used much)

OVS has a common high-level interface+logic and several different
implementations of a "datapath". One of the datapaths is inside the Linux
kernel, which we're discussing here; another is completely in userspace
(to make use of DPDK or AF_XDP); there is also an implementation for the
Windows kernel. Since the way to talk with the Linux kernel is netlink,
OVS uses netlink-based communication between the high-level
parts and all types of datapaths. Some features might be supported by
one datapath and not supported by others, hence some way to extend the
communication is needed. E.g., the kernel currently doesn't parse ND
extensions, but the userspace datapath does.

But yes, the current implementation is awful and OVS needs to have a
different way of managing datapath-specific attributes and not touch
kernel-defined types. We'll work on that.


Since ovs uses genetlink you should be able to dump the policy from
the kernel and at least validate that it doesn't overlap.

That is interesting. Indeed, this functionality can be used to detect
problems or to define userspace-only attributes at runtime based on the
kernel reply. Thanks for the pointer!

As you note, you'd have to do that at runtime since it can change, even
the policy. And things not in the policy probably should never be sent
to the kernel even if strict validation isn't used.

Agree. AFAICT, OVS currently doesn't send to the kernel things that the kernel
doesn't support.
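
The runtime check discussed above could look roughly like the following sketch
in C. It covers only the comparison step and assumes the attribute types from
the kernel's policy have already been collected into an array, presumably via
the generic netlink policy dump (CTRL_CMD_GETPOLICY on the control family).
The actual netlink exchange is not shown, and all toy_* names and the 64-type
limit are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limit: this sketch tracks attribute types 0..63 in one word. */
#define TOY_MAX_ATTR 63

static uint64_t toy_attr_bit(unsigned int type)
{
	assert(type <= TOY_MAX_ATTR);
	return UINT64_C(1) << type;
}

/* Build a bitmask from the attribute types found in the kernel's policy. */
uint64_t toy_policy_mask(const unsigned int *types, size_t n)
{
	uint64_t mask = 0;
	size_t i;

	for (i = 0; i < n; i++)
		mask |= toy_attr_bit(types[i]);
	return mask;
}

/*
 * Return the first userspace-private type that the kernel also defines,
 * or -1 if the private types do not overlap the kernel policy.
 */
int toy_first_overlap(uint64_t kernel_mask, const unsigned int *priv, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (kernel_mask & toy_attr_bit(priv[i]))
			return (int)priv[i];
	return -1;
}

/* Only attributes present in the kernel policy may be sent to the kernel. */
int toy_may_send(uint64_t kernel_mask, unsigned int type)
{
	return (kernel_mask & toy_attr_bit(type)) != 0;
}
```

Since the policy can change between kernels, such a mask would have to be
rebuilt at runtime, for example when the datapath is opened, and
toy_may_send() consulted before adding an attribute to a message even when
strict validation is not in use.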


johannes