Re: [PATCH net-next 3/3] net/mlx5: Apply devlink default eswitch mode during init
From: Jiri Pirko
Date: Tue May 26 2026 - 10:10:22 EST
Tue, May 26, 2026 at 11:44:46AM +0200, mbloch@xxxxxxxxxx wrote:
>
>
>On 26/05/2026 10:44, Jiri Pirko wrote:
>> Thu, May 21, 2026 at 09:24:34AM +0200, tariqt@xxxxxxxxxx wrote:
>>> From: Mark Bloch <mbloch@xxxxxxxxxx>
>>>
>>> Apply devlink default eswitch mode for mlx5 devices after successful
>>> device initialization while holding the devlink instance lock.
>>>
>>> At this point the devlink instance is registered and the mlx5 devlink
>>> operations are available, so the default eswitch mode can be applied to
>>> the matching PCI devlink handle.
>>>
>>> Signed-off-by: Mark Bloch <mbloch@xxxxxxxxxx>
>>> Reviewed-by: Shay Drori <shayd@xxxxxxxxxx>
>>> Reviewed-by: Moshe Shemesh <moshe@xxxxxxxxxx>
>>> Signed-off-by: Tariq Toukan <tariqt@xxxxxxxxxx>
>>> ---
>>> drivers/net/ethernet/mellanox/mlx5/core/main.c | 17 +++++++++++++++++
>>> 1 file changed, 17 insertions(+)
>>>
>>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
>>> index 0c6e4efe38c8..4528097f3d84 100644
>>> --- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
>>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
>>> @@ -1391,6 +1391,21 @@ static void mlx5_unload(struct mlx5_core_dev *dev)
>>> mlx5_free_bfreg(dev, &dev->priv.bfreg);
>>> }
>>>
>>> +static void mlx5_devl_apply_default_esw_mode(struct mlx5_core_dev *dev)
>>> +{
>>> + struct devlink *devlink = priv_to_devlink(dev);
>>> + int err;
>>> +
>>> + if (!MLX5_ESWITCH_MANAGER(dev))
>>> + return;
>>> +
>>> + devl_assert_locked(devlink);
>>> + err = devl_apply_default_esw_mode(devlink);
>>> + if (err)
>>> + mlx5_core_warn(dev, "Couldn't apply default eswitch mode, err %d\n",
>>> + err);
>>> +}
>>> +
>>> int mlx5_init_one_devl_locked(struct mlx5_core_dev *dev)
>>> {
>>> bool light_probe = mlx5_dev_is_lightweight(dev);
>>> @@ -1437,6 +1452,7 @@ int mlx5_init_one_devl_locked(struct mlx5_core_dev *dev)
>>> mlx5_core_err(dev, "mlx5_hwmon_dev_register failed with error code %d\n", err);
>>>
>>> mutex_unlock(&dev->intf_state_mutex);
>>> + mlx5_devl_apply_default_esw_mode(dev);
>>
>> I wonder how we can make this work for all. I mean, other drivers would
>> silently ignore this command line arg, right? Any idea how to make all
>> drivers follow the arg from the very beginning?
>>
>
>I have a follow-up series that adds the call to all drivers which support
>setting eswitch mode. When going over the other drivers, what I found is
>that the right point to apply the default is driver specific. Drivers
>I have patches for:
>
>46e16c6d9836 net: Apply devlink esw mode defaults
>ab4f54102ba9 bnxt_en: Apply devlink default eswitch mode during init
>b48cce1607bb liquidio: Apply devlink default eswitch mode during init
>4ea54b0fe04a ice: Apply devlink default eswitch mode during init
>b7faddaa1c90 octeontx2-af: Apply devlink default eswitch mode during init
>74b0c22c47b9 octeontx2-pf: Apply devlink default eswitch mode during init
>5000e4c3d768 nfp: Apply devlink default eswitch mode during init
>97a218e95e41 netdevsim: Apply devlink default eswitch mode during init
>
>I don't think doing this generically from devlink is realistic. devlink
>doesn't really know when a given driver is ready to change eswitch mode.
>Some drivers need SR-IOV state, representor setup, or other init pieces to
>be ready first, and the locking is not identical across drivers either.

Low-hanging fruit would be to just call ops->eswitch_mode_set at the end
of register. Multiple reasons:
1) The end of devl_register is exactly the point at which userspace is
   free to issue the eswitch mode set. The driver should be ready to
   handle it.
2) All drivers would transparently get this functionality, without
   actually knowing this kernel command line arg ever existed, and
   without the odd wiring call of a related exported function. I prefer
   that strongly.
3) You should add a warning there for the case where this arg is passed
   yet the driver does not implement eswitch_mode_set. The user should
   get feedback like this, not a silent ignore.

The only loose end I see is the void return of devl_register().
There are multiple ways to handle a possibly failed eswitch_mode_set().
I would probably just go for pr_warn, which seems to be the most correct.
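
To make the idea concrete, here is a rough user-space sketch, not kernel
code. Everything named mock_* or default_mode*, the struct layouts, and
the two example callbacks are made up for illustration; only the
eswitch_mode_set callback name and the "warn instead of fail" behaviour
come from the points above. The warning counter exists only so the
behaviour can be checked.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel types; this is not the devlink API. */
enum esw_mode { ESW_MODE_LEGACY, ESW_MODE_SWITCHDEV };

struct mock_devlink;

struct mock_devlink_ops {
	/* NULL when the driver cannot change the eswitch mode. */
	int (*eswitch_mode_set)(struct mock_devlink *dl, enum esw_mode mode);
};

struct mock_devlink {
	const struct mock_devlink_ops *ops;
	enum esw_mode mode;
	int warnings;	/* number of warnings emitted for this instance */
};

/* Assumption: filled in by whatever parses the command line argument. */
int default_mode_given;
enum esw_mode default_mode;

void mock_apply_default_esw_mode(struct mock_devlink *dl)
{
	int err;

	if (!default_mode_given)
		return;

	/* Point 3: tell the user instead of silently ignoring the arg. */
	if (!dl->ops || !dl->ops->eswitch_mode_set) {
		fprintf(stderr, "default eswitch mode requested, but the driver cannot set it\n");
		dl->warnings++;
		return;
	}

	err = dl->ops->eswitch_mode_set(dl, default_mode);
	if (err) {
		/* Register returns void here, so a failure is only warned about. */
		fprintf(stderr, "couldn't apply default eswitch mode, err %d\n", err);
		dl->warnings++;
	}
}

void mock_devl_register(struct mock_devlink *dl)
{
	/* ... the actual registration work would happen here ... */

	/* Points 1 and 2: applied at the end of register, for every driver. */
	mock_apply_default_esw_mode(dl);
}

/* Example driver callback that accepts the requested mode. */
int good_mode_set(struct mock_devlink *dl, enum esw_mode mode)
{
	dl->mode = mode;
	return 0;
}

/* Example driver callback that refuses; 16 stands in for EBUSY. */
int failing_mode_set(struct mock_devlink *dl, enum esw_mode mode)
{
	(void)dl;
	(void)mode;
	return -16;
}
```

With this shape a driver needs no extra call of its own: one that
implements the callback gets the default applied, and one that does not
produces a warning once the arg is given.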
Make sense?
>
>Also, since this knob is only about eswitch mode, I don't think we need to
>touch every devlink driver. Drivers that don't implement eswitch_mode_set()
>would just ignore it anyway. The follow-up only wires the default into
>drivers that actually support changing eswitch mode.
>
>Mark
>
>>
>>> return 0;
>>>
>>> err_register:
>>> @@ -1538,6 +1554,7 @@ int mlx5_load_one_devl_locked(struct mlx5_core_dev *dev, bool recovery)
>>> goto err_attach;
>>>
>>> mutex_unlock(&dev->intf_state_mutex);
>>> + mlx5_devl_apply_default_esw_mode(dev);
>>> return 0;
>>>
>>> err_attach:
>>> --
>>> 2.44.0
>>>
>