[PATCH 3.16 187/306] net/mlx4_en: Fix potential deadlock in port statistics flow

From: Ben Hutchings
Date: Wed Feb 15 2017 - 19:32:26 EST


3.16.40-rc1 review patch. If anyone has any objections, please let me know.

------------------

From: Jack Morgenstein <jackm@xxxxxxxxxxxxxxxxxx>

commit d2582a03939ed0a80ffcd3ea5345505bc8067c54 upstream.

mlx4_en_DUMP_ETH_STATS took the *counter mutex* and then
called the FW command, with WRAPPED attribute. As a result, the fw command
is wrapped on the Hypervisor when it calls mlx4_en_DUMP_ETH_STATS.
The FW command wrapper flow on the hypervisor takes the *slave_cmd_mutex*
during processing.

At the same time, a VF could be in the process of coming up, and could
call mlx4_QUERY_FUNC_CAP. On the hypervisor, the command flow takes the
*slave_cmd_mutex*, then executes mlx4_QUERY_FUNC_CAP_wrapper.
mlx4_QUERY_FUNC_CAP wrapper calls mlx4_get_default_counter_index(),
which takes the *counter mutex*. DEADLOCK.

The fix is that the DUMP_ETH_STATS fw command should be called with
the NATIVE attribute, so that on the hypervisor, this command does not
enter the wrapper flow.

Since the Hypervisor no longer goes through the wrapper code, we also
simply return 0 in mlx4_DUMP_ETH_STATS_wrapper (i.e.the function succeeds,
but the returned data will be all zeroes).
No need to test if it is the Hypervisor going through the wrapper.

Fixes: f9baff509f8a ("mlx4_core: Add "native" argument to mlx4_cmd ...")
Signed-off-by: Jack Morgenstein <jackm@xxxxxxxxxxxxxxxxxx>
Signed-off-by: Tariq Toukan <tariqt@xxxxxxxxxxxx>
Signed-off-by: David S. Miller <davem@xxxxxxxxxxxxx>
[bwh: Backported to 3.16: mlx4_en_DUMP_ETH_STATS() only uses this command once]
Signed-off-by: Ben Hutchings <ben@xxxxxxxxxxxxxxx>
---
--- a/drivers/net/ethernet/mellanox/mlx4/en_port.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_port.c
@@ -127,7 +127,7 @@ int mlx4_en_DUMP_ETH_STATS(struct mlx4_e
return PTR_ERR(mailbox);
err = mlx4_cmd_box(mdev->dev, 0, mailbox->dma, in_mod, 0,
MLX4_CMD_DUMP_ETH_STATS, MLX4_CMD_TIME_CLASS_B,
- MLX4_CMD_WRAPPED);
+ MLX4_CMD_NATIVE);
if (err)
goto out;

--- a/drivers/net/ethernet/mellanox/mlx4/mlx4.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
@@ -1249,8 +1249,6 @@ int mlx4_SET_VLAN_FLTR_wrapper(struct ml
struct mlx4_cmd_info *cmd);
int mlx4_common_set_vlan_fltr(struct mlx4_dev *dev, int function,
int port, void *buf);
-int mlx4_common_dump_eth_stats(struct mlx4_dev *dev, int slave, u32 in_mod,
- struct mlx4_cmd_mailbox *outbox);
int mlx4_DUMP_ETH_STATS_wrapper(struct mlx4_dev *dev, int slave,
struct mlx4_vhcr *vhcr,
struct mlx4_cmd_mailbox *inbox,
--- a/drivers/net/ethernet/mellanox/mlx4/port.c
+++ b/drivers/net/ethernet/mellanox/mlx4/port.c
@@ -1143,24 +1143,13 @@ int mlx4_SET_VLAN_FLTR_wrapper(struct ml
return err;
}

-int mlx4_common_dump_eth_stats(struct mlx4_dev *dev, int slave,
- u32 in_mod, struct mlx4_cmd_mailbox *outbox)
-{
- return mlx4_cmd_box(dev, 0, outbox->dma, in_mod, 0,
- MLX4_CMD_DUMP_ETH_STATS, MLX4_CMD_TIME_CLASS_B,
- MLX4_CMD_NATIVE);
-}
-
int mlx4_DUMP_ETH_STATS_wrapper(struct mlx4_dev *dev, int slave,
struct mlx4_vhcr *vhcr,
struct mlx4_cmd_mailbox *inbox,
struct mlx4_cmd_mailbox *outbox,
struct mlx4_cmd_info *cmd)
{
- if (slave != dev->caps.function)
- return 0;
- return mlx4_common_dump_eth_stats(dev, slave,
- vhcr->in_modifier, outbox);
+ return 0;
}

void mlx4_set_stats_bitmap(struct mlx4_dev *dev, u64 *stats_bitmap)