Re: [PATCH net 1/3] net/mlx5e: SD, Fix race condition in secondary device probe/remove
From: Jakub Kicinski
Date: Thu Apr 02 2026 - 20:46:36 EST
On Thu, 2 Apr 2026 23:03:10 +0300 Shay Drori wrote:
> On 02/04/2026 6:08, Jakub Kicinski wrote:
> > On Mon, 30 Mar 2026 22:34:10 +0300 Tariq Toukan wrote:
> >> From: Shay Drory <shayd@xxxxxxxxxx>
> >>
> >> When utilizing Socket-Direct single netdev functionality the driver
> >> resolves the actual auxiliary device using mlx5_sd_get_adev(). However,
> >> the current implementation returns the primary ETH auxiliary device
> >> without holding the device lock, leading to a potential race condition
> >> where the ETH device could be unbound or removed concurrently during
> >> probe, suspend, resume, or remove operations.[1]
> >>
> >> Fix this by introducing mlx5_sd_put_adev() and updating
> >> mlx5_sd_get_adev() so that secondaries devices would acquire the device
> >> lock of the returned auxiliary device. After the lock is acquired, a
> >> second devcom check is needed[2].
> >> In addition, update The callers to pair the get operation with the new
> >> put operation, ensuring the lock is held while the auxiliary device is
> >> being operated on and released afterwards.
> >
> > Please explain why the "primary" designation is reliable, and therefore
> > we can be sure there will be no ABBA deadlock here
>
> The "primary" designation is determined once in sd_register(). It's set
> before devcom is marked ready, and it never changes after that.
> In Addition, The primary path never locks a secondary: When the primary
> device invoke mlx5_sd_get_adev(), it sees dev == primary and returns.
> no additional lock is taken.
> Therefore lock ordering is always: secondary_lock → primary_lock. The
> reverse never happens, so ABBA deadlock is impossible.
And the device_lock instances have separate lockdep classes?
So lockdep will also understand this?
> Does the above is the explanation you looked for?
> If not, can you elaborate?
> If yes, to add it to the commit message in V2?
Sounds good, please add to the msg.
> >> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> >> index b6c12460b54a..5761f655f488 100644
> >> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> >> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> >> @@ -6657,8 +6657,11 @@ static int mlx5e_resume(struct auxiliary_device *adev)
> >> return err;
> >>
> >> actual_adev = mlx5_sd_get_adev(mdev, adev, edev->idx);
> >> - if (actual_adev)
> >> - return _mlx5e_resume(actual_adev);
> >> + if (actual_adev) {
> >> + err = _mlx5e_resume(actual_adev);
> >> + mlx5_sd_put_adev(actual_adev, adev);
> >> + return err;
> >> + }
> >> return 0;
> >
> > Feels like I recently complained about similar code y'all were trying
> > to add. Magically and conditionally locking something in a get helper
> > makes for extremely confusing code.
>
> Do you think explicit locking API is preferred here?
> something like:
> new_locking_api()
>
> mlx5_sd_get_adev()
>
> new_unlocking_api()
Readability is hard, I'd just push the locking up into the callers TBH.
Looks like there's only 4, the LoC delta isn't going to be huge.