[PATCH net] net/mlx5e: Use sender devcom for MPV master-up

From: Manjunath Patil

Date: Wed Jun 10 2026 - 13:42:39 EST


After PCIe DPC recovery, mlx5 reloads the affected functions and
replays multiport affiliation events. In the reported failure, the
first relevant device error was:

  pcieport 0000:10:01.1: DPC: containment event
  pcieport 0000:10:01.1: PCIe Bus Error: severity=Uncorrected (Fatal)
  pcieport 0000:10:01.1: [ 5] SDES (First)

mlx5 recovered the PCI functions and resumed 0000:11:00.1. During
that resume, RDMA multiport binding replayed
MLX5_DRIVER_EVENT_AFFILIATION_DONE and mlx5e sent
MPV_DEVCOM_MASTER_UP. The host then panicked with:

  BUG: kernel NULL pointer dereference, address: 0000000000000010
  RIP: mlx5_devcom_comp_set_ready+0x5/0x40 [mlx5_core]
  RDI: 0000000000000000

Call trace included:

  mlx5_devcom_comp_set_ready
  mlx5e_devcom_event_mpv
  mlx5_devcom_send_event
  mlx5_ib_bind_slave_port
  mlx5r_mp_probe
  mlx5_pci_resume

MPV devcom registration publishes mlx5e private data to the component
peer list before mlx5e_devcom_init_mpv() stores the returned component
device in priv->devcom. A concurrent master-up event can therefore
reach a peer whose private data is visible but whose priv->devcom
backpointer is still NULL.

MPV_DEVCOM_MASTER_UP already carries the sender/master mlx5e private
data as event_data. The ready bit is stored on the shared devcom
component, not on an individual peer. Use the sender devcom when
marking the MPV component ready.

This preserves the readiness transition while avoiding a NULL
dereference of the peer devcom pointer during affiliation replay after
PCI error recovery.

Fixes: bf11485f8419 ("net/mlx5: Register mlx5e priv to devcom in MPV mode")
Assisted-by: Codex:gpt-5
Signed-off-by: Manjunath Patil <manjunath.b.patil@xxxxxxxxxx>
Cc: stable@xxxxxxxxxxxxxxx # 6.7+
---
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 8f2b3abe0092..f7ff20b97e8c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -211,11 +211,14 @@ static void mlx5e_disable_async_events(struct mlx5e_priv *priv)

 static int mlx5e_devcom_event_mpv(int event, void *my_data, void *event_data)
 {
-	struct mlx5e_priv *slave_priv = my_data;
+	struct mlx5e_priv *master_priv = event_data;

 	switch (event) {
 	case MPV_DEVCOM_MASTER_UP:
-		mlx5_devcom_comp_set_ready(slave_priv->devcom, true);
+		if (!master_priv || !master_priv->devcom)
+			return -EINVAL;
+
+		mlx5_devcom_comp_set_ready(master_priv->devcom, true);
 		break;
 	case MPV_DEVCOM_MASTER_DOWN:
 		/* no need for comp set ready false since we unregister after
--
2.47.3