Re: [PATCH net-next v2 5/5] devlink: Delete reload enable/disable interface

From: Leon Romanovsky
Date: Mon Oct 04 2021 - 15:02:15 EST


On Mon, Oct 04, 2021 at 07:54:02PM +0300, Ido Schimmel wrote:
> On Mon, Oct 04, 2021 at 06:45:07PM +0300, Leon Romanovsky wrote:
> > On Mon, Oct 04, 2021 at 05:19:40PM +0300, Ido Schimmel wrote:
> > > On Sun, Oct 03, 2021 at 09:12:06PM +0300, Leon Romanovsky wrote:
> > > > From: Leon Romanovsky <leonro@xxxxxxxxxx>
> > > >
> > > > After changes to allow dynamically set the reload_up/_down callbacks,
> > > > we ensure that properly supported devlink ops are not accessible before
> > > > devlink_register, which is last command in the initialization sequence.
> > > >
> > > > It makes devlink_reload_enable/_disable not relevant anymore and can be
> > > > safely deleted.
> > > >
> > > > Signed-off-by: Leon Romanovsky <leonro@xxxxxxxxxx>
> > >
> > > [...]
> > >
> > > > diff --git a/drivers/net/netdevsim/dev.c b/drivers/net/netdevsim/dev.c
> > > > index cb6645012a30..09e48fb232a9 100644
> > > > --- a/drivers/net/netdevsim/dev.c
> > > > +++ b/drivers/net/netdevsim/dev.c
> > > > @@ -1512,7 +1512,6 @@ int nsim_dev_probe(struct nsim_bus_dev *nsim_bus_dev)
> > > >
> > > > nsim_dev->esw_mode = DEVLINK_ESWITCH_MODE_LEGACY;
> > > > devlink_register(devlink);
> > > > - devlink_reload_enable(devlink);
> > > > return 0;
> > > >
> > > > err_psample_exit:
> > > > @@ -1566,9 +1565,7 @@ void nsim_dev_remove(struct nsim_bus_dev *nsim_bus_dev)
> > > > struct nsim_dev *nsim_dev = dev_get_drvdata(&nsim_bus_dev->dev);
> > > > struct devlink *devlink = priv_to_devlink(nsim_dev);
> > > >
> > > > - devlink_reload_disable(devlink);
> > > > devlink_unregister(devlink);
> > > > -
> > > > nsim_dev_reload_destroy(nsim_dev);
> > > >
> > > > nsim_bpf_dev_exit(nsim_dev);
> > >
> > > I didn't remember why devlink_reload_{enable,disable}() were added in
> > > the first place so it was not clear to me from the commit message why
> > > they can be removed. It is described in commit a0c76345e3d3 ("devlink:
> > > disallow reload operation during device cleanup") with a reproducer.
> >
> > It was added because devlink ops were accessible by the user space very
> > early in the driver lifetime. All my latest devlink patches are the
> > attempt to fix this arch/design/implementation issue.
>
> The reproducer in the commit message executed the reload after the
> device was fully initialized. IIRC, the problem there was that nothing
> prevented these two tasks from racing:
>
> devlink dev reload netdevsim/netdevsim10
> echo 10 > /sys/bus/netdevsim/del_device
>
> The title also talks about forbidding reload during device cleanup.

It is incomplete title and reproducer. In our verification, we observed
more than 40 bugs related to devlink reload flows and races around it.

>
> >
> > >
> > > Tried the reproducer with this series and I cannot reproduce the issue.
> > > Wasn't quite sure why, but it does not seem to be related to "changes to
> > > allow dynamically set the reload_up/_down callbacks", as this seems to
> > > be specific to mlx5.
> >
> > You didn't reproduce because of my series that moved
> > devlink_register()/devlink_unregister() to be last/first commands in
> > .probe()/.remove() flows.
>
> Agree, that is what I wrote in the next paragraph of my reply.
>
> >
> > Patch to allow dynamically set ops was needed because mlx5 had logic
> > like this:
> > if(something)
> > devlink_reload_enable()
> >
> > And I needed a way to keep this if ... condition.
> >
> > >
> > > IIUC, the reason that the race described in above mentioned commit can
> > > no longer happen is related to the fact that devlink_unregister() is
> > > called first in the device dismantle path, after your previous patches.
> > > Since both the reload operation and devlink_unregister() hold
> > > 'devlink_mutex', it is not possible for the reload operation to race
> > > with device dismantle.
> > >
> > > Agree? If so, I think it would be good to explain this in the commit
> > > message unless it's clear to everyone else.
> >
> > I don't agree for very simple reason that devlink_mutex is going to be
> > removed very soon and it is really not a reason why devlink reload is
> > safer now when before.
> >
> > The reload can't race due to:
> > 1. devlink_unregister(), which works as a barrier to stop accesses
> > from the user space.
> > 2. reference counting that ensures that all in-flight commands are counted.
> > 3. wait_for_completion that blocks till all commands are done.
>
> So the wait_for_completion() is what prevents the race, not
> 'devlink_mutex' that is taken later. This needs to be explained in the
> commit message to make it clear why the removal is safe.

Can you please suggest what exactly should I write in the commit message
to make it clear?

I'm too much into this delvink stuff already and for me this patch is
trivial. IMHO, that change doesn't need an explanation at all because
coding pattern of refcount + wait_for_completion is pretty common in the
kernel. So I think that I explained good enough: move of
devlink_register/devlink_unregister obsoletes the devlink_reload_* APIs.

I have no problem to update the commit message, just help me with the
message.

Thanks

>
> Thanks