Re: [PATCH] net: lan966x: restore XDP RXQ page pool on reload failure
From: David CARLIER
Date: Thu Jun 04 2026 - 10:00:45 EST
On Thu, 4 Jun 2026 at 12:31, Guangshuo Li <lgs201920130244@xxxxxxxxx> wrote:
>
> lan966x_fdma_rx_alloc_page_pool() registers the newly allocated page pool
> with each port's XDP RXQ. During an FDMA reload, this registration is
> performed before fdma_alloc_coherent() is called.
>
> If fdma_alloc_coherent() fails, lan966x_fdma_rx_alloc() destroys the new
> page pool and returns an error. The reload rollback path restores
> rx->page_pool and rx->fdma to their old values, but it does not restore
> the per-port XDP RXQ mem model registration. As a result, the XDP RXQ
> state can remain associated with the page pool from the failed allocation
> attempt while the RX state has been rolled back to the old page pool.
>
> Restore the old page pool registration for each port's XDP RXQ in the
> reload rollback path before RX is started again, so the XDP RXQ state
> matches the restored RX state.
>
> Fixes: 59c3d55a946c ("net: lan966x: fix use-after-free and leak in lan966x_fdma_reload()")
> Signed-off-by: Guangshuo Li <lgs201920130244@xxxxxxxxx>
> ---
> .../net/ethernet/microchip/lan966x/lan966x_fdma.c | 15 +++++++++++++++
> 1 file changed, 15 insertions(+)
>
> diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_fdma.c b/drivers/net/ethernet/microchip/lan966x/lan966x_fdma.c
> index f8ce735a7fc0..76654b44baf2 100644
> --- a/drivers/net/ethernet/microchip/lan966x/lan966x_fdma.c
> +++ b/drivers/net/ethernet/microchip/lan966x/lan966x_fdma.c
> @@ -855,6 +855,21 @@ static int lan966x_fdma_reload(struct lan966x *lan966x, int new_mtu)
> restore:
> lan966x->rx.page_pool = page_pool;
> memcpy(&lan966x->rx.fdma, &fdma_rx_old, sizeof(struct fdma));
> + /*
> + * lan966x_fdma_rx_alloc_page_pool() registered the new pool with
> + * each port's XDP RXQ before the allocation failed. The new pool is
> + * destroyed by lan966x_fdma_rx_alloc(), so restore the old pool's
> + * registration before restarting RX.
> + */
> + for (int i = 0; i < lan966x->num_phys_ports; i++) {
> + if (!lan966x->ports[i])
> + continue;
> +
> + xdp_rxq_info_unreg_mem_model(&lan966x->ports[i]->xdp_rxq);
> + xdp_rxq_info_reg_mem_model(&lan966x->ports[i]->xdp_rxq,
> + MEM_TYPE_PAGE_POOL, page_pool);
> + }
> +
> lan966x_fdma_rx_start(&lan966x->rx);
>
> lan966x_fdma_wakeup_netdev(lan966x);
> --
> 2.43.0
>
Hi Guangshuo,
Just one remark. The re-registration fix looks right, but the same
restore path has a related gap.
Just above the failing alloc we set:
    lan966x->rx.page_order = round_up(new_mtu, PAGE_SIZE) / PAGE_SIZE - 1;
    lan966x->rx.max_mtu = new_mtu;
These aren't rolled back in restore:, so after a failed reload
page_order/max_mtu hold the new values while pages/fdma/page_pool are
the old ones. lan966x_xdp_run() then calls xdp_init_buff() with
PAGE_SIZE << page_order, advertising a too-large frame_sz. For a jumbo
MTU that bumps page_order, bpf_xdp_adjust_tail() could grow past the
real page.
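To put numbers on the mismatch, here is a small userspace sketch of the
arithmetic. It is not driver code: SKETCH_PAGE_SIZE (4 KiB pages) and the
sketch_* helpers are made up for illustration, and the MTU values are
example inputs only.

```c
#include <assert.h>

/* Assumption for illustration: 4 KiB pages. */
#define SKETCH_PAGE_SIZE 4096UL

/* Userspace equivalent of round_up(x, align) for this sketch. */
static unsigned long sketch_round_up(unsigned long x, unsigned long align)
{
	return ((x + align - 1) / align) * align;
}

/* Mirrors: round_up(new_mtu, PAGE_SIZE) / PAGE_SIZE - 1 (mtu must be > 0). */
static unsigned int sketch_page_order(unsigned long mtu)
{
	return (unsigned int)(sketch_round_up(mtu, SKETCH_PAGE_SIZE) /
			      SKETCH_PAGE_SIZE - 1);
}

/* Mirrors the frame_sz passed to xdp_init_buff(): PAGE_SIZE << page_order. */
static unsigned long sketch_frame_sz(unsigned int page_order)
{
	return SKETCH_PAGE_SIZE << page_order;
}
```

With these assumptions, a 1500-byte request gives page_order 0 (frame_sz
4096), while a 9000-byte request gives page_order 2 (frame_sz 16384). If
the second value survives a failed reload, the advertised frame_sz is four
times the size of the buffers the old pool actually hands out.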
Could you save and restore the old page_order/max_mtu too, so the
restored RX state is fully consistent?
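Roughly what I have in mind, as a standalone mock rather than a patch
against the driver. struct mock_rx, mock_rx_alloc() and mock_reload() are
hypothetical stand-ins (pool_id stands in for page_pool/fdma), and the
4 KiB page size is an assumption; only the save/restore shape is the point.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption for illustration: 4 KiB pages. */
#define MOCK_PAGE_SIZE 4096u

/* Hypothetical stand-in for the RX fields discussed; not the driver struct. */
struct mock_rx {
	unsigned int page_order;
	unsigned int max_mtu;
	int pool_id;	/* stands in for page_pool / fdma */
};

/* Stand-in for the allocation step; fails when alloc_ok is false. */
static int mock_rx_alloc(struct mock_rx *rx, bool alloc_ok)
{
	if (!alloc_ok)
		return -1;
	rx->pool_id++;	/* pretend a new pool was installed */
	return 0;
}

/* Reload that snapshots every field it is about to change (new_mtu > 0). */
static int mock_reload(struct mock_rx *rx, unsigned int new_mtu, bool alloc_ok)
{
	unsigned int old_page_order = rx->page_order;
	unsigned int old_max_mtu = rx->max_mtu;
	int old_pool_id = rx->pool_id;
	int err;

	rx->page_order = (new_mtu + MOCK_PAGE_SIZE - 1) / MOCK_PAGE_SIZE - 1;
	rx->max_mtu = new_mtu;

	err = mock_rx_alloc(rx, alloc_ok);
	if (err)
		goto restore;

	return 0;

restore:
	/* Roll back sizing fields together with the pool itself. */
	rx->pool_id = old_pool_id;
	rx->page_order = old_page_order;
	rx->max_mtu = old_max_mtu;
	return err;
}
```

After a failed mock_reload(), all three fields match their values from
before the call, which is the invariant I'd like restore: to guarantee.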
Cheers.