Re: [PATCH net-next] devlink: Require devlink lock during device reload
From: Ido Schimmel
Date: Sun Nov 07 2021 - 12:16:13 EST
On Mon, Nov 01, 2021 at 04:11:22PM -0700, Jakub Kicinski wrote:
> On Mon, 1 Nov 2021 22:52:19 +0200 Ido Schimmel wrote:
> > > >Signed-off-by: Leon Romanovsky <leonro@xxxxxxxxxx>
> > >
> > > Looks fine to me.
> > >
> > > Reviewed-by: Jiri Pirko <jiri@xxxxxxxxxx>
> >
> > Traces from mlxsw / netdevsim below:
>
> Thanks a lot for the testing Ido!
>
> Would you mind giving my RFC a spin as well on your syzbot machinery?
Sorry for the delay. I didn't have a lot of time last week.
I tried to apply your set [1] on top of net-next, but I'm getting a
conflict with patch #5. Can you send me (here / privately) a link to a
git tree that has the patches on top of net-next?
TBH, if you ran the netdevsim selftests with a debug config and nothing
exploded, then I don't expect syzkaller to find anything (major).
[1] https://lore.kernel.org/netdev/20211030231254.2477599-1-kuba@xxxxxxxxxx/
>
> Any input on the three discussion points there?
>
> (1) should we have a "locked" and "unlocked" API or use lock nesting?
Judging by the netdevsim conversion, it seems that for the vast majority
of APIs (if not all) we will only have an "unlocked" API. Drivers will
hold the devlink instance lock on probe / remove and devlink itself will
hold the lock when calling into drivers (e.g., on reload, port split).
>
> (2) should we expose devlink lock so that drivers can use devlink
> as a framework for their locking needs?
It is better than dropping locks (e.g., DEVLINK_NL_FLAG_NO_LOCK, which I
expect will go away after the conversion). With the asserts you put in
place, misuses will be caught early.
>
> (3) should we let drivers take refs on the devlink instance?
I think it's fine mainly because I don't expect it to be used by too
many drivers other than netdevsim which is somewhat special. Looking at
the call sites of devlink_get() in netdevsim, it is only called from
places (debugfs and trap workqueue) that shouldn't be present in real
drivers.
The tl;dr is that your approach makes sense to me. I was initially
worried that we will need to propagate a "reload" argument everywhere in
drivers, but you wrote "The expectation is that driver will take the
devlink instance lock on its probe and remove paths", which avoids that.
Thanks for working on that