Re: Linux: DMA-after-unmap race in ZCRX via netif_rxq_cleanup_unlease() ordering inversion (netkit + page_pool)

From: Daniel Borkmann

Date: Fri May 29 2026 - 03:59:02 EST


On 5/29/26 2:56 AM, Prénom? Ahmed wrote:
Hi Jakub,

The patch looks correct; the reordering and the stack-copy approach both make sense.

One thing I forgot to mention in my original report: I did trace the teardown path at runtime via ftrace/kprobes and confirmed that the race window opens. I was not able to win the race, though: no actual DMA write was triggered, so exploitability remains hardware-dependent.

Could you please add a Reported-by tag in the final commit :)?

  Reported-by: Ahmed Abdelmoemen <ahmedabdelmoumen05@xxxxxxxxx>

Yeap, will add the Reported-by, thanks!

On Thu, May 28, 2026, 23:28 Daniel Borkmann <daniel@xxxxxxxxxxxxx> wrote:

Hi Ahmed,

On 5/28/26 1:33 AM, Jakub Kicinski wrote:
> Dropping security lists, security lists are for private discussions,
> it's utterly pointless to CC both them and LKML. Not to mention
> that this bug only exists in -rc kernels.
>
> Adding relevant developers. Moving security@ to Bcc

Thanks for the report! I think a fix could look as below. Before submitting,
I would prefer though if David could check this against real HW supporting
mem providers e.g. BCM NIC:

diff --git a/net/core/netdev_rx_queue.c b/net/core/netdev_rx_queue.c
index de4dac4c88b3..00a7011eb4d5 100644
--- a/net/core/netdev_rx_queue.c
+++ b/net/core/netdev_rx_queue.c
@@ -338,12 +338,12 @@ void __netif_mp_uninstall_rxq(struct netdev_rx_queue *rxq,
  void netif_rxq_cleanup_unlease(struct netdev_rx_queue *phys_rxq,
                               struct netdev_rx_queue *virt_rxq)
  {
-       struct pp_memory_provider_params *p = &phys_rxq->mp_params;
        unsigned int rxq_idx = get_netdev_rx_queue_index(phys_rxq);
+       struct pp_memory_provider_params p = phys_rxq->mp_params;

-       if (!p->mp_ops)
+       if (!p.mp_ops)
                return;

-       __netif_mp_uninstall_rxq(virt_rxq, p);
-       __netif_mp_close_rxq(phys_rxq->dev, rxq_idx, p);
+       __netif_mp_close_rxq(phys_rxq->dev, rxq_idx, &p);
+       __netif_mp_uninstall_rxq(virt_rxq, &p);
  }
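
For anyone following along, here is a minimal userspace sketch of why the
reordering and the stack copy go together. It is only a model under stated
assumptions: the model_* types and functions are hypothetical stand-ins, not
the kernel definitions, and it assumes that the close step clears the queue's
provider slot, so a pointer into phys_rxq->mp_params would read NULL ops
afterwards.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; NOT the kernel's real definitions. */
struct model_mp_params {
	const void *mp_ops;	/* non-NULL while a provider is installed */
};

struct model_rxq {
	struct model_mp_params mp_params;
	bool napi_running;		/* device may still consume buffers */
	bool dma_mapped;		/* provider buffers still DMA-mapped */
	bool unmapped_while_running;	/* records the hazardous ordering */
};

static int model_ops;	/* dummy object so that mp_ops is non-NULL */

static struct model_rxq model_rxq_init(void)
{
	struct model_rxq q = {
		.mp_params = { .mp_ops = &model_ops },
		.napi_running = true,
		.dma_mapped = true,
		.unmapped_while_running = false,
	};
	return q;
}

/* Models "close": stop the queue and clear its provider slot (assumed). */
static void model_close(struct model_rxq *q)
{
	q->napi_running = false;
	q->mp_params.mp_ops = NULL;
}

/* Models "uninstall": unmap buffers, flagging an unmap on a live queue. */
static void model_uninstall(struct model_rxq *q,
			    const struct model_mp_params *p)
{
	if (!p->mp_ops)
		return;		/* provider looks already gone: nothing done */
	if (q->napi_running)
		q->unmapped_while_running = true;
	q->dma_mapped = false;
}

/* Old order: unmap first, stop second. */
static void teardown_old(struct model_rxq *q)
{
	const struct model_mp_params *p = &q->mp_params;

	model_uninstall(q, p);
	model_close(q);
}

/* Reordered but still using a pointer: uninstall sees the cleared slot. */
static void teardown_reordered_by_pointer(struct model_rxq *q)
{
	const struct model_mp_params *p = &q->mp_params;

	model_close(q);
	model_uninstall(q, p);
}

/* Order from the patch: snapshot by value, stop first, unmap second. */
static void teardown_fixed(struct model_rxq *q)
{
	struct model_mp_params p = q->mp_params;

	model_close(q);
	model_uninstall(q, &p);
}
```

In this model, teardown_old() unmaps while the queue is still running,
teardown_reordered_by_pointer() avoids that but never unmaps because the
pointer now reads the cleared slot, and teardown_fixed() stops the queue and
then unmaps using the by-value snapshot.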

> On Wed, 27 May 2026 23:53:45 +0100 Prénom? Ahmed wrote:
>> Hello,
>>
>> I would like to report a source-proven teardown ordering bug in the Linux
>> kernel that can lead to a DMA-after-unmap race condition involving ZCRX
>> (io_uring zero-copy receive), page_pool, and netkit queue leasing.
>>
>> Reporter: Ahmed Abdelmoemen
>> Discovery Date: 2026-05-26
>> Kernel Version: Linux 7.1.0-rc3
>>
>> Executive Summary
>>
>> *A logic error in `netif_rxq_cleanup_unlease()` causes DMA mappings for the
>> ZCRX memory provider to be revoked **before** the physical NIC RX queue is
>> stopped. This creates a race window during netkit queue lease teardown
>> where the physical device's NAPI can consume stale `net_iov` entries from
>> the page_pool alloc cache containing `dma_addr = 0`.*
>>
>> The ordering inversion is fully proven at the source level. However, I have
>> **not** performed runtime verification, so actual memory corruption or
>> successful DMA to address 0 has **not** been proven — it remains hardware
>> and driver dependent.
>>
>> The bug is reachable with `CAP_NET_ADMIN` (common in container
>> environments) when using netkit with ZCRX.
>>
>> Root Cause
>>
>> In `net/core/netdev_rx_queue.c:347-348`:
>>
>> ```c
>> __netif_mp_uninstall_rxq(virt_rxq, p); // DMA unmap + dma_addr=0
>> __netif_mp_close_rxq(...); // queue stop + NAPI disable (TOO LATE)
>> ```
>>
>> This inverts the correct ordering used in normal device unregistration and
>> io_uring close paths (stop first, then unmap).
>>
>> Impact
>>
>>     - *Potential:* NIC DMA write to physical address 0 (or stale mappings
>>     with lazy IOMMU) leading to memory corruption.
>>     - *Requirements:* CAP_NET_ADMIN + netkit queue leasing + ZCRX installed
>>     on the leased queue.
>>     - *Current Status:* No runtime PoC or crash reproduction yet. The race
>>     window exists in theory but its practical exploitability needs confirmation.
>>
>> I am attaching the full detailed analysis.
>>
>> Proposed Fix
>>
>> [image: image.png]
>>
>> I am happy to provide more details or assist with testing.