RE: [PATCH net-next] netdevsim: Add max_vfs to bus_dev
From: Yuval Avnery
Date: Thu Dec 12 2019 - 00:11:21 EST
> -----Original Message-----
> From: Jakub Kicinski <jakub.kicinski@xxxxxxxxxxxxx>
> Sent: Wednesday, December 11, 2019 3:50 PM
> To: Yuval Avnery <yuvalav@xxxxxxxxxxxx>
> Cc: Jiri Pirko <jiri@xxxxxxxxxxxx>; davem@xxxxxxxxxxxxx;
> netdev@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx
> Subject: Re: [PATCH net-next] netdevsim: Add max_vfs to bus_dev
>
> On Wed, 11 Dec 2019 23:25:09 +0000, Yuval Avnery wrote:
> > > -----Original Message-----
> > > From: Jakub Kicinski <jakub.kicinski@xxxxxxxxxxxxx>
> > > Sent: Wednesday, December 11, 2019 2:24 PM
> > > To: Yuval Avnery <yuvalav@xxxxxxxxxxxx>
> > > Cc: Jiri Pirko <jiri@xxxxxxxxxxxx>; davem@xxxxxxxxxxxxx;
> > > netdev@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx
> > > Subject: Re: [PATCH net-next] netdevsim: Add max_vfs to bus_dev
> > >
> > > On Wed, 11 Dec 2019 19:57:34 +0000, Yuval Avnery wrote:
> > > > > -----Original Message-----
> > > > > From: Jakub Kicinski <jakub.kicinski@xxxxxxxxxxxxx>
> > > > > Sent: Wednesday, December 11, 2019 11:16 AM
> > > > > To: Yuval Avnery <yuvalav@xxxxxxxxxxxx>
> > > > > Cc: Jiri Pirko <jiri@xxxxxxxxxxxx>; davem@xxxxxxxxxxxxx;
> > > > > netdev@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx
> > > > > Subject: Re: [PATCH net-next] netdevsim: Add max_vfs to bus_dev
> > > > >
> > > > > On Wed, 11 Dec 2019 18:19:56 +0000, Yuval Avnery wrote:
> > > > > > > On Wed, 11 Dec 2019 04:58:53 +0200, Yuval Avnery wrote:
> > > > > > > > Currently there is no limit to the number of VFs netdevsim can
> enable.
> > > > > > > > In a real systems this value exist and used by driver.
> > > > > > > > Fore example, Some features might need to consider this
> > > > > > > > value when allocating memory.
> > > > > > >
> > > > > > > Thanks for the patch!
> > > > > > >
> > > > > > > Can you shed a little bit more light on where it pops up?
> > > > > > > Just for my
> > > > > curiosity?
> > > > > >
> > > > > > Yes, like we described in the subdev threads.
> > > > > > User should be able to configure some attributes before the VF
> > > > > > was
> > > > > enabled.
> > > > > > So all those (persistent) VF attributes should be available
> > > > > > for query and configuration before VF was enabled.
> > > > > > The driver can allocate an array according to max_vfs to hold
> > > > > > all that data, like we do here in" vfconfigs".
> > > > >
> > > > > I was after more practical reasoning, are you writing some tests
> > > > > for subdev stuff that will depend on this change? :)
> > > >
> > > > Yes we are writing tests for subdev with this.
> > >
> > > Okay, please post v2 together with the tests. We don't accept
> > > netdevsim features without tests any more.
> >
> > I think the only test I can currently write is the enable SR-IOV max_vfs
> enforcement.
> > Because subdev is not in yet.
> > Will that be good enough?
>
> It'd be good to test some netdev API rather than just the enforcement itself
> which is entirely in netdevsim, I think.
>
> So max_vfs enforcement plus checking that ip link lists the correct number of
> entries (and perhaps the entries are in reset state after
> enable) would do IMO.
Ok, but this is possible regardless of my patch (to enable vfs).
>
> > > > This is the way mlx5 works.. is that practical enough?
> > > >
> > > > > > > > Signed-off-by: Yuval Avnery <yuvalav@xxxxxxxxxxxx>
> > > > > > > > Acked-by: Jiri Pirko <jiri@xxxxxxxxxxxx>
> > > > > > > >
> > > > > > > > diff --git a/drivers/net/netdevsim/bus.c
> > > > > > > > b/drivers/net/netdevsim/bus.c index
> > > > > > > > 6aeed0c600f8..f1a0171080cb
> > > > > > > > 100644
> > > > > > > > --- a/drivers/net/netdevsim/bus.c
> > > > > > > > +++ b/drivers/net/netdevsim/bus.c
> > > > > > > > @@ -26,9 +26,9 @@ static struct nsim_bus_dev
> > > > > > > > *to_nsim_bus_dev(struct device *dev) static int
> > > > > > > > nsim_bus_dev_vfs_enable(struct nsim_bus_dev
> > > > > > > *nsim_bus_dev,
> > > > > > > > unsigned int num_vfs) {
> > > > > > > > - nsim_bus_dev->vfconfigs = kcalloc(num_vfs,
> > > > > > > > - sizeof(struct
> > > nsim_vf_config),
> > > > > > > > - GFP_KERNEL);
> > > > > > >
> > > > > > > You're changing the semantics of the enable/disable as well now.
> > > > > > > The old values used to be wiped when SR-IOV is disabled, now
> > > > > > > they will be retained across disable/enable pair.
> > > > > > >
> > > > > > > I think it'd be better if that wasn't the case. Users may
> > > > > > > expect a system to be in the same state after they enable
> > > > > > > SR-IOV, regardless if someone else used SR-IOV since last reboot.
> > > > > >
> > > > > > Right,
> > > > > > But some values should retain across enable/disable, for
> > > > > > example MAC
> > > > > address which is persistent.
> > > > > > So maybe we need to retain some values, while resetting others
> > > > > > on
> > > > > disable?
> > > > > > Would that work?
> > > > >
> > > > > Mmm. That is a good question. For all practical purposes SR-IOV
> > > > > used to be local to the host that enables it until Smart/middle
> > > > > box NICs
> > > emerged.
> > > > >
> > > > > Perhaps the best way forward would be to reset the config that
> > > > > was set via legacy APIs and keep only the MACs provisioned via
> > > > > persistent
> > > devlink API?
> > > > >
> > > > > So for now we'd memset, and once devlink API lands reset
> selectively?
> > > >
> > > > Legacy is also persistent.
> > > > Currently when you set mac address with "ip link vf set mac" it is
> > > > persistent
> > > (at least in mlx5).
> > >
> > > "Currently in mlx5" - maybe, but this is netdevsim. Currently it
> > > clears the config on re-enable which I believe to be preferable as
> explained before.
> > >
> > > > But ip link only exposes enabled VFS, so driver on VF has to
> > > > reload to
> > > acquire this MAC.
> > > > With devlink subdev it will be possible to set the MAC before VF
> > > > was
> > > enabled.
> > >
> > > Yup, sure. As I said, once subdev is implemented, we will treat the
> > > addresses set by it differently. Those are inherently persistent or
> > > rather their life time is independent of just the SR-IOV host.
> >
> > Ok, got it.
> > I am just wondering how this works when you have "ip link" and devlink
> setting the MAC independently.
> > Will they show the same MAC?
> > Or ip link will show the non-persistent MAC And devlink the persistent?
>
> My knee jerk reaction is that we should populate the values to those set via
> devlink upon SR-IOV enable, but then if user overwrites those values that's
> their problem.
>
> Sort of mirror how VF MAC addrs work, just a level deeper. The VF defaults
> to the MAC addr provided by the PF after reset, but it can change it to
> something else (things may stop working because spoof check etc. will drop
> all its frames, but nothing stops the VF in legacy HW from writing its MAC
> addr register).
>
> IOW the devlink addr is the default/provisioned addr, not necessarily the
> addr the PF has set _now_.
>
> Other options I guess are (a) reject the changes of the address from the PF
> once devlink has set a value; (b) provide some device->control CPU notifier
> which can ack/reject a request from the PF to change devlink's value..?
>
> You guys posted the devlink patches a while ago, what was your
> implementation doing?
devlink simply calls the driver with set or get.
It is up to the vendor driver/HW if to make this address persistent or not.
The address is not saved in the devlink layer.
The MAC address in mlx5 is stored in the HW and
persistent (until PF reset) , whether it is set by devlink or ip link.
So from what I understand, we have the freedom to choose how netdevsim
behave in this case, which means non-persistent is ok.