[PATCH can v2 1/2] can: m_can: enable NAPI before enabling interrupts
From: Marc Kleine-Budde
Date: Mon Sep 09 2024 - 10:30:15 EST
From: "Hamby, Jake (US)" <Jake.Hamby@xxxxxxxxxxxx>
If any error flags are set when bringing up the CAN device, e.g. due
to CAN bus traffic before initializing the device, m_can_isr() is
called immediately once m_can_start() enables interrupts. The ISR
then disables all CAN interrupts and calls napi_schedule().
Because napi_enable() isn't called until later in m_can_open(), the
call to napi_schedule() never schedules the m_can_poll() callback and
the device is left with interrupts disabled and can't receive any CAN
packets until rebooted.
This can be verified by running "cansend" from another device before
setting the bitrate and calling "ip link set up can0" on the test
device. Adding debug lines to m_can_isr() shows that it is called with
flags (IR_EP | IR_EW | IR_CRCE) and calls m_can_disable_all_interrupts()
and napi_schedule(), but m_can_poll() is never called afterwards.
Move the call to napi_enable() above the call to m_can_start() so
that any initial interrupt flags are handled by m_can_poll() and
interrupts are reenabled. Add a call to napi_disable() in the
error handling section of m_can_open(), to handle the case where later
functions return errors.
Also, in m_can_close(), move the call to napi_disable() below the call
to m_can_stop() to ensure all interrupts are handled when bringing
down the device. This race condition is much less likely to occur.
Tested on a Microchip SAMA7G54 MPU. The fix should be applicable to
any SoC with a Bosch M_CAN controller.
Not-Signed-off-by: Hamby, Jake (US) <Jake.Hamby@xxxxxxxxxxxx>
Fixes: e0d1f4816f2a ("can: m_can: add Bosch M_CAN controller support")
Signed-off-by: Marc Kleine-Budde <mkl@xxxxxxxxxxxxxx>
---
drivers/net/can/m_can/m_can.c | 16 ++++++++++------
1 file changed, 10 insertions(+), 6 deletions(-)
diff --git a/drivers/net/can/m_can/m_can.c b/drivers/net/can/m_can/m_can.c
index 012c3d22b01dd3d8558f2a40448770ca1da1aa1e..7754dd2d4cb110eee5b83885f5381aed9c67ce03 100644
--- a/drivers/net/can/m_can/m_can.c
+++ b/drivers/net/can/m_can/m_can.c
@@ -1763,13 +1763,14 @@ static int m_can_close(struct net_device *dev)
netif_stop_queue(dev);
- if (!cdev->is_peripheral)
- napi_disable(&cdev->napi);
-
m_can_stop(dev);
m_can_clk_stop(cdev);
free_irq(dev->irq, dev);
+ /* disable NAPI after disabling interrupts */
+ if (!cdev->is_peripheral)
+ napi_disable(&cdev->napi);
+
m_can_clean(dev);
if (cdev->is_peripheral) {
@@ -2031,6 +2032,10 @@ static int m_can_open(struct net_device *dev)
if (cdev->is_peripheral)
can_rx_offload_enable(&cdev->offload);
+ /* enable NAPI before enabling interrupts */
+ if (!cdev->is_peripheral)
+ napi_enable(&cdev->napi);
+
/* register interrupt handler */
if (cdev->is_peripheral) {
cdev->tx_wq = alloc_ordered_workqueue("mcan_wq",
@@ -2063,9 +2068,6 @@ static int m_can_open(struct net_device *dev)
if (err)
goto exit_start_fail;
- if (!cdev->is_peripheral)
- napi_enable(&cdev->napi);
-
netif_start_queue(dev);
return 0;
@@ -2077,6 +2079,8 @@ static int m_can_open(struct net_device *dev)
if (cdev->is_peripheral)
destroy_workqueue(cdev->tx_wq);
out_wq_fail:
+ if (!cdev->is_peripheral)
+ napi_disable(&cdev->napi);
if (cdev->is_peripheral)
can_rx_offload_disable(&cdev->offload);
close_candev(dev);
--
2.45.2