Re: [PATCH net-next] net: ag71xx: disable napi interrupts during probe

From: Rosen Penev
Date: Thu Aug 29 2024 - 13:46:56 EST


On Wed, Aug 28, 2024 at 2:05 PM Jacob Keller <jacob.e.keller@xxxxxxxxx> wrote:
>
>
>
> On 8/28/2024 1:41 PM, Rosen Penev wrote:
> > From: Sven Eckelmann <sven@xxxxxxxxxxxxx>
> >
> > ag71xx_probe is registering ag71xx_interrupt as handler for gmac0/gmac1
> > interrupts. The handler is trying to use napi_schedule to handle the
> > processing of packets. But the netif_napi_add for this device is
> > called a lot later in ag71xx_probe.
> >
> > It can therefore happen that a still running gmac0/gmac1 is triggering the
> > interrupt handler with a bit from AG71XX_INT_POLL set in
> > AG71XX_REG_INT_STATUS. The handler will then call napi_schedule and the
> > napi code will crash the system because the ag->napi is not yet
> > initialized.
> >
> > The gmac0/gmac1 must be brought into a state in which it doesn't signal
> > AG71XX_INT_POLL related status bits as interrupts before registering the
> > interrupt handler. ag71xx_hw_start will take care of re-initializing the
> > AG71XX_REG_INT_ENABLE.
> >
> > Signed-off-by: Sven Eckelmann <sven@xxxxxxxxxxxxx>
> > Signed-off-by: Rosen Penev <rosenp@xxxxxxxxx>
> > ---
>
> The description reads like a bug fix, so I would expect this to be
> targeted to net and have a Fixes tag indicating what commit introduced
> the issue, maybe:
>
> Fixes: d51b6ce441d3 ("net: ethernet: add ag71xx driver")
>
> The change seems reasonable to me otherwise.
OTOH, there are currently no dual-GMAC users upstream, only single-GMAC ones.

>
> > drivers/net/ethernet/atheros/ag71xx.c | 6 ++++++
> > 1 file changed, 6 insertions(+)
> >
> > diff --git a/drivers/net/ethernet/atheros/ag71xx.c b/drivers/net/ethernet/atheros/ag71xx.c
> > index 0674a042e8d3..435c4b19acdd 100644
> > --- a/drivers/net/ethernet/atheros/ag71xx.c
> > +++ b/drivers/net/ethernet/atheros/ag71xx.c
> > @@ -1855,6 +1855,12 @@ static int ag71xx_probe(struct platform_device *pdev)
> >  	if (!ag->mac_base)
> >  		return -ENOMEM;
> >
> > +	/* ensure that HW is in manual polling mode before interrupts are
> > +	 * activated. Otherwise ag71xx_interrupt might call napi_schedule
> > +	 * before it is initialized by netif_napi_add.
> > +	 */
> > +	ag71xx_int_disable(ag, AG71XX_INT_POLL);
> > +
> >  	ndev->irq = platform_get_irq(pdev, 0);
> >  	err = devm_request_irq(&pdev->dev, ndev->irq, ag71xx_interrupt,
> >  			       0x0, dev_name(&pdev->dev), ndev);
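
For illustration only, the ordering problem described in the commit message
can be modeled in plain userspace C. This is a rough sketch, not driver code:
every fake_* name, the struct layout and the INT_POLL value are made up, and
it assumes the hardware only raises the IRQ line for status bits that are also
set in the enable register. It counts how often the handler would try to
schedule NAPI before NAPI has been set up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mask standing in for AG71XX_INT_POLL (value is made up). */
#define INT_POLL 0x3u

/* Hypothetical model of the MAC state; not the real struct ag71xx. */
struct fake_mac {
	uint32_t int_status;	/* models AG71XX_REG_INT_STATUS */
	uint32_t int_enable;	/* models AG71XX_REG_INT_ENABLE */
	bool irq_registered;	/* models devm_request_irq() having run */
	bool napi_ready;	/* models netif_napi_add() having run */
	int bad_schedules;	/* NAPI schedules attempted too early */
};

static void fake_int_disable(struct fake_mac *m, uint32_t bits)
{
	m->int_enable &= ~bits;
}

/* Models the handler: schedule NAPI when a poll-related bit is pending. */
static void fake_interrupt(struct fake_mac *m)
{
	if (m->int_status & INT_POLL) {
		fake_int_disable(m, INT_POLL);
		if (!m->napi_ready)
			m->bad_schedules++;	/* would crash in the kernel */
	}
}

/* Assumption: the IRQ line fires only for enabled status bits. */
static void fake_hw_raise(struct fake_mac *m)
{
	if (m->irq_registered && (m->int_status & m->int_enable))
		fake_interrupt(m);
}

/* Returns the number of too-early NAPI schedules seen during "probe". */
static int fake_probe(bool mask_first)
{
	/* A MAC left running by the bootloader: bits pending and enabled. */
	struct fake_mac m = {
		.int_status = INT_POLL,
		.int_enable = INT_POLL,
	};

	if (mask_first) {
		fake_int_disable(&m, INT_POLL);	/* the step the patch adds */
		assert(!(m.int_enable & INT_POLL));
	}

	m.irq_registered = true;	/* handler is live from here on */
	fake_hw_raise(&m);		/* the still-running MAC fires */
	m.napi_ready = true;		/* NAPI setup happens much later */

	return m.bad_schedules;
}
```

In this model, fake_probe(false) reports one premature schedule and
fake_probe(true) reports none, which is the effect the added
ag71xx_int_disable() call is meant to have.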