Re: [PATCH net-next v5 3/5] rtnetlink: Decouple net namespaces in rtnl_newlink_create()

From: Xiao Liang
Date: Thu Dec 12 2024 - 07:42:02 EST


On Thu, Dec 12, 2024 at 5:27 PM Paolo Abeni <pabeni@xxxxxxxxxx> wrote:
>
> On 12/9/24 15:01, Xiao Liang wrote:
> > There are 4 net namespaces involved when creating links:
> >
> > - source netns - where the netlink socket resides,
> > - target netns - where to put the device being created,
> > - link netns - netns associated with the device (backend),
> > - peer netns - netns of peer device.
> >
> > Currently, two nets are passed to newlink() callback - "src_net"
> > parameter and "dev_net" (implicitly in net_device). They are set as
> > follows, depending on netlink attributes.
> >
> > +------------+-------------------+---------+---------+
> > | peer netns | IFLA_LINK_NETNSID | src_net | dev_net |
> > +------------+-------------------+---------+---------+
> > | | absent | source | target |
> > | absent +-------------------+---------+---------+
> > | | present | link | link |
> > +------------+-------------------+---------+---------+
> > | | absent | peer | target |
> > | present +-------------------+---------+---------+
> > | | present | peer | link |
> > +------------+-------------------+---------+---------+
> >
> > When IFLA_LINK_NETNSID is present, the device is created in link netns
> > first. This has some side effects, including extra ifindex allocation,
> > ifname validation and link notifications. There's also an extra step to
> > move the device to target netns. These could be avoided if we create it
> > in target netns at the beginning.
> >
> > On the other hand, the meaning of src_net is ambiguous. It varies
> > depending on how parameters are passed. It is the effective link or peer
> > netns by design, but some drivers ignore it and use dev_net instead.
> >
> > This patch refactors netns handling by packing newlink() parameters into
> > a struct, and passing source, link and peer netns as is through this
> > struct. Fallback logic is implemented in helper functions -
> > rtnl_newlink_link_net() and rtnl_newlink_peer_net(). If is not set, peer
> > netns falls back to link netns, and link netns falls back to source netns.
> > rtnl_newlink_create() now creates devices in target netns directly,
> > so dev_net is always target netns.
> >
> > For drivers that use dev_net as fallback of link_netns, current behavior
> > is kept for compatibility.
> >
> > Signed-off-by: Xiao Liang <shaw.leon@xxxxxxxxx>
>
> I must admit this patch is way too huge for me to allow any reasonable
> review except that this has the potential of breaking a lot of things.
>
> I think you should be splitted to make it more palatable; i.e.
> - a patch just add the params struct with no semantic changes.
> - a patch making the dev_change_net_namespace() conditional on net !=
> tge_net[1]
> - many per-device patches creating directly the device in the target
> namespace.
> - a patch reverting [1]
>
> Other may have different opinions, I'd love to hear them.

Thanks. I understand your concern. Since the device is created in common
code, how about splitting the patch this way:

1) make the params struct contain both current src_net and other netns:
struct rtnl_newlink_params {
struct net *net; // renamed from current src_net
struct net *src_net; // real src_net
struct net *link_net;
...
};
2) convert each driver to use the accurate netns,
3) remove "net", which is not used now, from params struct,
4) change rtnl_newlink_create() to create device in target netns
directly.

So 1) will be a big one but has no semantic changes.
And I will send Patch 1 in this series to the net tree instead.

>
> > diff --git a/drivers/net/amt.c b/drivers/net/amt.c
> > index 98c6205ed19f..2f7bf50e05d2 100644
> > --- a/drivers/net/amt.c
> > +++ b/drivers/net/amt.c
> > @@ -3161,14 +3161,17 @@ static int amt_validate(struct nlattr *tb[], struct nlattr *data[],
> > return 0;
> > }
> >
> > -static int amt_newlink(struct net *net, struct net_device *dev,
> > - struct nlattr *tb[], struct nlattr *data[],
> > - struct netlink_ext_ack *extack)
> > +static int amt_newlink(struct rtnl_newlink_params *params)
> > {
> > + struct net_device *dev = params->dev;
> > + struct nlattr **tb = params->tb;
> > + struct nlattr **data = params->data;
> > + struct netlink_ext_ack *extack = params->extack;
> > + struct net *link_net = rtnl_newlink_link_net(params);
> > struct amt_dev *amt = netdev_priv(dev);
> > int err = -EINVAL;
>
> Minor nit: here and and many other places, please respect the reverse
> xmas tree order.

Will fix this.