[PATCH v3] IB/mlx5: Fix tdn leak and state corruption in mlx5_ib_alloc_transport_domain

From: Prathamesh Deshpande

Date: Wed Apr 01 2026 - 18:37:15 EST


In mlx5_ib_alloc_transport_domain(), the early success path returns
'err', which is always 0 at that point, instead of a literal 0.

Additionally, as identified by Sashiko, if mlx5_ib_enable_lb() fails
at the end of the function:
1. The allocated transport domain (tdn) is leaked.
2. The internal loopback software state and reference counters are
   left in an inconsistent state.

Explicitly return 0 in the early success path. If mlx5_ib_enable_lb()
fails, call mlx5_ib_disable_lb() to roll back the software state, then
free the transport domain with mlx5_cmd_dealloc_transport_domain().

Signed-off-by: Prathamesh Deshpande <prathameshdeshpande7@xxxxxxxxx>
---
v3:
- Also call mlx5_ib_disable_lb() on failure to roll back software
  state/counters [Sashiko].
v2:
- Added deallocation of tdn if mlx5_ib_enable_lb() fails [Sashiko].
- Reworded commit message to reflect the functional fix and credit the tool.

Hi Leon,

In this v3, I've incorporated the additional fix identified by Sashiko.
Beyond the tdn leak, Sashiko pointed out that a failure in
mlx5_ib_enable_lb() leaves internal software state and counters
inconsistent. I've added a call to mlx5_ib_disable_lb() in the
error path to safely roll back those changes.
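
To illustrate why both rollback steps are needed, here is a toy userspace
model of the error path. It is only a sketch, not driver code: every toy_*
name is hypothetical, and it assumes the loopback counter is incremented
before the step that can fail.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins; none of these names exist in the mlx5 driver. */
struct toy_dev {
	int user_td;      /* transport domains counted by the loopback logic */
	bool lb_enabled;  /* loopback currently marked as enabled */
	int tdn_live;     /* transport domains currently allocated */
	bool hw_fails;    /* test knob: simulated firmware command fails */
};

/* Simulated firmware call; returns a negative value on failure. */
static int toy_hw_set_lb(struct toy_dev *d, bool on)
{
	(void)on;
	return d->hw_fails ? -5 : 0;
}

/* Assumption: the counter is bumped before the step that can fail. */
static int toy_enable_lb(struct toy_dev *d)
{
	int err = 0;

	d->user_td++;
	if (d->user_td == 2 && !d->lb_enabled) {
		err = toy_hw_set_lb(d, true);
		if (!err)
			d->lb_enabled = true;
	}
	return err;
}

/* Drops the reference taken by toy_enable_lb(), even if that call failed. */
static void toy_disable_lb(struct toy_dev *d)
{
	assert(d->user_td > 0);
	d->user_td--;
	if (d->user_td == 1 && d->lb_enabled) {
		toy_hw_set_lb(d, false);
		d->lb_enabled = false;
	}
}

/* Shape of the fixed function: on failure, undo the counter and the tdn. */
static int toy_alloc_transport_domain(struct toy_dev *d, bool rollback)
{
	int err;

	d->tdn_live++;			/* stands in for the tdn allocation */

	err = toy_enable_lb(d);
	if (err && rollback) {
		toy_disable_lb(d);	/* restore the counter */
		d->tdn_live--;		/* release the tdn */
	}
	return err;
}
```

With hw_fails set and user_td starting at 1, calling
toy_alloc_transport_domain() with rollback disabled leaves user_td at 2 and
tdn_live at 1 after the failure. With rollback enabled, both return to their
starting values.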

Thanks,
Prathamesh

drivers/infiniband/hw/mlx5/main.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index 635002e684a5..3d9f0e2e7548 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -2068,9 +2068,15 @@ static int mlx5_ib_alloc_transport_domain(struct mlx5_ib_dev *dev, u32 *tdn,
 	if ((MLX5_CAP_GEN(dev->mdev, port_type) != MLX5_CAP_PORT_TYPE_ETH) ||
 	    (!MLX5_CAP_GEN(dev->mdev, disable_local_lb_uc) &&
 	     !MLX5_CAP_GEN(dev->mdev, disable_local_lb_mc)))
-		return err;
+		return 0;
+
+	err = mlx5_ib_enable_lb(dev, true, false);
+	if (err) {
+		mlx5_ib_disable_lb(dev, true, false);
+		mlx5_cmd_dealloc_transport_domain(dev->mdev, *tdn, uid);
+	}
 
-	return mlx5_ib_enable_lb(dev, true, false);
+	return err;
 }
 
 static void mlx5_ib_dealloc_transport_domain(struct mlx5_ib_dev *dev, u32 tdn,
--
2.43.0