[PATCH AUTOSEL for 4.9 246/281] net: ena: fix rare uncompleted admin command false alarm

From: Sasha Levin
Date: Mon Mar 19 2018 - 13:45:12 EST


From: Netanel Belgazal <netanel@xxxxxxxxxx>

[ Upstream commit a77c1aafcc906f657d1a0890c1d898be9ee1d5c9 ]

The current flow to detect admin completion is:
while (command_not_completed) {
if (timeout)
error

check_for_completion()
sleep()
}
So in case the sleep took more than the timeout
(in case the thread/workqueue was not scheduled due to higher priority
task or prolonged VMexit), the driver can detect a stall even if
the completion is present.

The fix changes the order of this function to first check for
completion and only after that check if the timeout expired.

Fixes: 1738cd3ed342 ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@xxxxxxxxxx>
Signed-off-by: David S. Miller <davem@xxxxxxxxxxxxx>
Signed-off-by: Sasha Levin <alexander.levin@xxxxxxxxxxxxx>
---
drivers/net/ethernet/amazon/ena/ena_com.c | 21 +++++++++++----------
1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/amazon/ena/ena_com.c b/drivers/net/ethernet/amazon/ena/ena_com.c
index e2512ab41168..564b556baba9 100644
--- a/drivers/net/ethernet/amazon/ena/ena_com.c
+++ b/drivers/net/ethernet/amazon/ena/ena_com.c
@@ -508,15 +508,20 @@ static int ena_com_comp_status_to_errno(u8 comp_status)
static int ena_com_wait_and_process_admin_cq_polling(struct ena_comp_ctx *comp_ctx,
struct ena_com_admin_queue *admin_queue)
{
- unsigned long flags;
- u32 start_time;
+ unsigned long flags, timeout;
int ret;

- start_time = ((u32)jiffies_to_usecs(jiffies));
+ timeout = jiffies + ADMIN_CMD_TIMEOUT_US;
+
+ while (1) {
+ spin_lock_irqsave(&admin_queue->q_lock, flags);
+ ena_com_handle_admin_completion(admin_queue);
+ spin_unlock_irqrestore(&admin_queue->q_lock, flags);

- while (comp_ctx->status == ENA_CMD_SUBMITTED) {
- if ((((u32)jiffies_to_usecs(jiffies)) - start_time) >
- ADMIN_CMD_TIMEOUT_US) {
+ if (comp_ctx->status != ENA_CMD_SUBMITTED)
+ break;
+
+ if (time_is_before_jiffies(timeout)) {
pr_err("Wait for completion (polling) timeout\n");
/* ENA didn't have any completion */
spin_lock_irqsave(&admin_queue->q_lock, flags);
@@ -528,10 +533,6 @@ static int ena_com_wait_and_process_admin_cq_polling(struct ena_comp_ctx *comp_c
goto err;
}

- spin_lock_irqsave(&admin_queue->q_lock, flags);
- ena_com_handle_admin_completion(admin_queue);
- spin_unlock_irqrestore(&admin_queue->q_lock, flags);
-
msleep(100);
}

--
2.14.1