Re: [PATCH] net: lan966x: restore XDP RXQ page pool on reload failure
From: Guangshuo Li
Date: Sun Jun 07 2026 - 10:57:54 EST
Hi David,
Thank you for the review.
On Thu, 4 Jun 2026 at 21:46, David CARLIER <devnexen@xxxxxxxxx> wrote:
>
> On Thu, 4 Jun 2026 at 12:31, Guangshuo Li <lgs201920130244@xxxxxxxxx> wrote:
> >
> > lan966x_fdma_rx_alloc_page_pool() registers the newly allocated page pool
> > with each port's XDP RXQ. During an FDMA reload, this registration is
> > performed before fdma_alloc_coherent() is called.
> >
> > If fdma_alloc_coherent() fails, lan966x_fdma_rx_alloc() destroys the new
> > page pool and returns an error. The reload rollback path restores
> > rx->page_pool and rx->fdma to their old values, but it does not restore
> > the per-port XDP RXQ mem model registration. As a result, the XDP RXQ
> > state can remain associated with the page pool from the failed allocation
> > attempt while the RX state has been rolled back to the old page pool.
> >
> > Restore the old page pool registration for each port's XDP RXQ in the
> > reload rollback path before RX is started again, so the XDP RXQ state
> > matches the restored RX state.
> >
> > Fixes: 59c3d55a946c ("net: lan966x: fix use-after-free and leak in lan966x_fdma_reload()")
> > Signed-off-by: Guangshuo Li <lgs201920130244@xxxxxxxxx>
> > ---
> > .../net/ethernet/microchip/lan966x/lan966x_fdma.c | 15 +++++++++++++++
> > 1 file changed, 15 insertions(+)
> >
> > diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_fdma.c b/drivers/net/ethernet/microchip/lan966x/lan966x_fdma.c
> > index f8ce735a7fc0..76654b44baf2 100644
> > --- a/drivers/net/ethernet/microchip/lan966x/lan966x_fdma.c
> > +++ b/drivers/net/ethernet/microchip/lan966x/lan966x_fdma.c
> > @@ -855,6 +855,21 @@ static int lan966x_fdma_reload(struct lan966x *lan966x, int new_mtu)
> > restore:
> > 	lan966x->rx.page_pool = page_pool;
> > 	memcpy(&lan966x->rx.fdma, &fdma_rx_old, sizeof(struct fdma));
> > +	/*
> > +	 * lan966x_fdma_rx_alloc_page_pool() registered the new pool with
> > +	 * each port's XDP RXQ before the allocation failed. The new pool is
> > +	 * destroyed by lan966x_fdma_rx_alloc(), so restore the old pool's
> > +	 * registration before restarting RX.
> > +	 */
> > +	for (int i = 0; i < lan966x->num_phys_ports; i++) {
> > +		if (!lan966x->ports[i])
> > +			continue;
> > +
> > +		xdp_rxq_info_unreg_mem_model(&lan966x->ports[i]->xdp_rxq);
> > +		xdp_rxq_info_reg_mem_model(&lan966x->ports[i]->xdp_rxq,
> > +					   MEM_TYPE_PAGE_POOL, page_pool);
> > +	}
> > +
> > 	lan966x_fdma_rx_start(&lan966x->rx);
> >
> > 	lan966x_fdma_wakeup_netdev(lan966x);
> > --
> > 2.43.0
> >
>
> Hi Guangshuo,
>
> Just one remark. The re-registration fix looks right, but the same
> restore path has a related gap.
>
> Just above the failing alloc we set:
>
> lan966x->rx.page_order = round_up(new_mtu, PAGE_SIZE) / PAGE_SIZE - 1;
> lan966x->rx.max_mtu = new_mtu;
>
> These aren't rolled back in restore:, so after a failed reload
> page_order/max_mtu are new while pages/fdma/page_pool are old.
> lan966x_xdp_run() then calls xdp_init_buff() with
> PAGE_SIZE << page_order, advertising a too-large frame_sz. For a
> jumbo MTU that bumps page_order, bpf_xdp_adjust_tail() could grow
> past the real page.
>
> Could you save and restore the old page_order/max_mtu too, so the
> restored RX state is fully consistent?
>
> Cheers.
Yes, that makes sense. I missed that page_order and max_mtu are also
updated before the failing allocation and need to be restored together
with the old RX state.
I will save and restore the old page_order/max_mtu in the reload rollback
path, and send a v2 with that fixed.
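To make the intended rollback concrete, here is a rough standalone userspace sketch, not the actual v2 patch. It only models the two fields in question; struct sketch_rx, the sketch_* helpers, the fixed 4096-byte page size and the alloc_ok flag are all hypothetical stand-ins, and it assumes new_mtu > 0. The page_order arithmetic mirrors the round_up(new_mtu, PAGE_SIZE) / PAGE_SIZE - 1 expression quoted above.

```c
#include <errno.h>
#include <stdbool.h>

/* Assumed page size for this sketch only. */
#define SKETCH_PAGE_SIZE 4096u

/* Hypothetical stand-in for the RX fields that a reload changes. */
struct sketch_rx {
	unsigned int page_order;
	unsigned int max_mtu;
};

/* Same arithmetic as round_up(mtu, PAGE_SIZE) / PAGE_SIZE - 1; mtu > 0. */
static unsigned int sketch_page_order(unsigned int mtu)
{
	return (mtu + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE - 1;
}

/* The frame size that would be advertised as PAGE_SIZE << page_order. */
static unsigned int sketch_frame_sz(const struct sketch_rx *rx)
{
	return SKETCH_PAGE_SIZE << rx->page_order;
}

/*
 * Model of the reload flow: save the old values, apply the new ones,
 * and put the old ones back if the (simulated) allocation fails.
 */
static int sketch_reload(struct sketch_rx *rx, unsigned int new_mtu,
			 bool alloc_ok)
{
	unsigned int old_page_order = rx->page_order;
	unsigned int old_max_mtu = rx->max_mtu;

	rx->page_order = sketch_page_order(new_mtu);
	rx->max_mtu = new_mtu;

	if (alloc_ok)
		return 0;

	/* Rollback: keep page_order/max_mtu consistent with the old pages. */
	rx->page_order = old_page_order;
	rx->max_mtu = old_max_mtu;
	return -ENOMEM;
}
```

With a state set up for MTU 1500 (page_order 0), a failed sketch_reload() to MTU 9000 leaves sketch_frame_sz() at one page instead of the four pages that page_order 2 would advertise.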
Thanks,
Guangshuo