Re: [PATCH net-next 00/10] net: lan966x: add support for PCIe FDMA

From: Daniel Machon

Date: Tue Apr 07 2026 - 09:21:01 EST

Hi Hervé,

> >
> > As I remembered, doing rmmod on the lan966x_switch followed by modprobe
> > lan966x_switch works fine. This is because neither the switch core, nor the FDMA
> > engine is reset, so they remain in sync.
> >
> > When the lan966x_pci module is removed and reloaded (what you did), the DT
> > overlay is re-applied, which causes the reset controller
> > (reset-microchip-sparx5) to re-probe. During probe, it performs a GCB soft reset
> > that resets the switch core, but protects the CPU domain from the reset. The
> > FDMA engine is part of the CPU domain, so it is not reset.
> >
> > This leaves the switch core in a reset state while the FDMA
> > retains state from the previous driver instance. When the switch driver
> > subsequently probes and activates the FDMA channels, the two are out of
> > sync, and the FDMA immediately reports extraction errors.
> >
> > There's actually an FDMA register called NRESET that resets the FDMA controller
> > state. Calling this in the FDMA init path causes traffic to work correctly on
> > lan966x_pci reload, but it does not get rid of the FDMA splats you posted above.
> > They get queued up between the switch core reset, in the reset controller, and
> > the FDMA enabling. I tried different approaches to drain or flush queues, but
> > they won't go away entirely.
> >
> > The only thing that seems to work consistently is to *not* do the soft reset in
> > the reset controller for the PCI path. The soft reset is actually the problem:
> > it only resets the switch core while protecting the CPU domain (including FDMA),
> > causing a desync.
> >
> > A simple fix could be (in reset-microchip-sparx5.c):
> >
> > +static bool mchp_reset_is_pci(struct device *dev)
> > +{
> > +	for (dev = dev->parent; dev; dev = dev->parent) {
> > +		if (dev_is_pci(dev))
> > +			return true;
> > +	}
> > +	return false;
> > +}
> >
> > -	/* Issue the reset very early, our actual reset callback is a noop. */
> > -	err = sparx5_switch_reset(ctx);
> > -	if (err)
> > -		return err;
> > +	/* Issue the reset very early, our actual reset callback is a noop.
> > +	 *
> > +	 * On the PCI path, skip the reset. The endpoint is already in
> > +	 * power-on reset state on the first probe. On subsequent probes
> > +	 * (after driver reload), resetting the switch core while the FDMA
> > +	 * retains state (CPU domain is protected from the soft reset)
> > +	 * causes the two to go out of sync, leading to FDMA extraction
> > +	 * errors.
> > +	 */
> > +	if (!mchp_reset_is_pci(&pdev->dev)) {
> > +		err = sparx5_switch_reset(ctx);
> > +		if (err)
> > +			return err;
> > +	}
> >
> > Could you test it and see if it helps with the problem on your side?
> >
>
> I have tested it on my ARM and x86 system. It fixes the lan966x_pci module
> unloading / reloading issue.
>
> However, another regression is present. After a reboot, without power
> off/on, the board is not working (tested on both my ARM and x86 systems).
>
> According to your explanation, this makes sense.
>
> IMHO, the problem is that we cannot make the assumption that "The endpoint
> is already in power-on reset state on the first probe". That's not true
> when you just call the reboot command.
>
> Best regards,
> Hervé

The following diff should fix both the FDMA traffic issue and the FDMA error
splat when reloading the lan966x_pci driver, by:

1. Resetting the FDMA engine in the PCI FDMA init path
(lan966x_fdma_pci_init()).

2. Clearing any spurious FDMA errors that latch due to the soft reset issued by
the reset driver.

diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_fdma_pci.c b/drivers/net/ethernet/microchip/lan966x/lan966x_fdma_pci.c
--- a/drivers/net/ethernet/microchip/lan966x/lan966x_fdma_pci.c
+++ b/drivers/net/ethernet/microchip/lan966x/lan966x_fdma_pci.c
@@ -372,6 +372,9 @@ static int lan966x_fdma_pci_init(struct lan966x *lan966x)
 	if (!lan966x->fdma)
 		return 0;

+	lan_wr(FDMA_CTRL_NRESET_SET(0), lan966x, FDMA_CTRL);
+	lan_wr(FDMA_CTRL_NRESET_SET(1), lan966x, FDMA_CTRL);
+
 	fdma_pci_atu_init(&lan966x->atu, lan966x->regs[TARGET_PCIE_DBI]);

 	lan966x->rx.lan966x = lan966x;
diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_main.c b/drivers/net/ethernet/microchip/lan966x/lan966x_main.c
--- a/drivers/net/ethernet/microchip/lan966x/lan966x_main.c
+++ b/drivers/net/ethernet/microchip/lan966x/lan966x_main.c
@@ -1071,6 +1071,15 @@ static int lan966x_reset_switch(struct lan966x *lan966x)

 	reset_control_reset(switch_reset);

+	/* When in PCI mode, the GCB soft reset issued by the reset
+	 * controller can latch spurious bits in the FDMA error stickies.
+	 * Clear them before request_irq hooks up the FDMA IRQ line,
+	 * otherwise the handler fires immediately on probe.
+	 */
+	lan_wr(lan_rd(lan966x, FDMA_ERRORS), lan966x, FDMA_ERRORS);
+	lan_wr(lan_rd(lan966x, FDMA_INTR_ERR), lan966x, FDMA_INTR_ERR);
+	lan_wr(lan_rd(lan966x, FDMA_INTR_DB), lan966x, FDMA_INTR_DB);
+
 	/* Don't reinitialize the switch core, if it is already initialized. In
 	 * case it is initialized twice, some pointers inside the queue system
 	 * in HW will get corrupted and then after a while the queue system gets
diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_regs.h b/drivers/net/ethernet/microchip/lan966x/lan966x_regs.h
--- a/drivers/net/ethernet/microchip/lan966x/lan966x_regs.h
+++ b/drivers/net/ethernet/microchip/lan966x/lan966x_regs.h
@@ -1010,6 +1010,15 @@ enum lan966x_target {
 #define FDMA_CH_CFG_CH_MEM_GET(x)\
 	FIELD_GET(FDMA_CH_CFG_CH_MEM, x)

+/* FDMA:FDMA:FDMA_CTRL */
+#define FDMA_CTRL __REG(TARGET_FDMA, 0, 1, 8, 0, 1, 428, 424, 0, 1, 4)
+
+#define FDMA_CTRL_NRESET BIT(0)
+#define FDMA_CTRL_NRESET_SET(x)\
+	FIELD_PREP(FDMA_CTRL_NRESET, x)
+#define FDMA_CTRL_NRESET_GET(x)\
+	FIELD_GET(FDMA_CTRL_NRESET, x)
+
 /* FDMA:FDMA:FDMA_PORT_CTRL */
 #define FDMA_PORT_CTRL(r) __REG(TARGET_FDMA, 0, 1, 8, 0, 1, 428, 376, r, 2, 4)

Let me know if it works on your end.
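
For anyone wondering about the lan_wr(lan_rd(...)) lines in lan966x_main.c above:
they only make sense if the sticky registers are write-1-to-clear, which is what
the read-then-write-back pattern suggests. Below is a small userspace model of
that behaviour. It is only a sketch under that assumption; struct sticky_reg and
the sticky_*() helpers are made-up names for illustration, not driver code and
not the lan966x register layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a write-1-to-clear (W1C) sticky register. */
struct sticky_reg {
	uint32_t val;
};

/* "Hardware" side: an event latches bits, which stay set until cleared. */
static void sticky_latch(struct sticky_reg *r, uint32_t bits)
{
	r->val |= bits;
}

static uint32_t sticky_rd(const struct sticky_reg *r)
{
	return r->val;
}

/* W1C write: each 1 written clears that bit, each 0 leaves it untouched. */
static void sticky_wr(struct sticky_reg *r, uint32_t v)
{
	r->val &= ~v;
}

/* Same shape as lan_wr(lan_rd(reg), reg): write back exactly what was read,
 * so only the bits that were pending at read time are cleared.
 */
static uint32_t sticky_clear_pending(struct sticky_reg *r)
{
	uint32_t pending = sticky_rd(r);

	sticky_wr(r, pending);
	return pending;
}

/* Small self-check of the model. */
static void sticky_selfcheck(void)
{
	struct sticky_reg r = { 0 };

	sticky_latch(&r, 0x5);			/* spurious bits after reset */
	assert(sticky_clear_pending(&r) == 0x5);
	assert(sticky_rd(&r) == 0);		/* nothing left to fire on */
}
```

In this model, clearing before the IRQ handler is hooked up means the handler
has no stale bits to report, which is the intent of step 2.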

(Btw, I have noticed another issue where TX stops working on lan966x_pci reload.
It happens less often, and it is unrelated to this patch series, since it also
happens in register-based INJ/XTR mode. Whenever it happens, you will see
"Flush timeout chip port" in the logs. This should also be fixed, but sent as a
separate fix, I believe.)

/Daniel