Re: Linux: DMA-after-unmap race in ZCRX via netif_rxq_cleanup_unlease() ordering inversion (netkit + page_pool)
From: Jakub Kicinski
Date: Wed May 27 2026 - 19:33:51 EST
Dropping security lists, security lists are for private discussions,
it's utterly pointless to CC both them and LKML. Not to mention
that this bug only exists in -rc kernels.
Adding relevant developers. Moving security@ to Bcc.
On Wed, 27 May 2026 23:53:45 +0100 Prénom? Ahmed wrote:
> Hello,
>
> I would like to report a source-proven teardown ordering bug in the Linux
> kernel that can lead to a DMA-after-unmap race condition involving ZCRX
> (io_uring zero-copy receive), page_pool, and netkit queue leasing.
>
> **Reporter:** Ahmed Abdelmoemen
> **Discovery Date:** 2026-05-26
> **Kernel Version:** Linux 7.1.0-rc3
>
> Executive Summary
>
> *A logic error in `netif_rxq_cleanup_unlease()` causes DMA mappings for the
> ZCRX memory provider to be revoked **before** the physical NIC RX queue is
> stopped. This creates a race window during netkit queue lease teardown
> where the physical device's NAPI can consume stale `net_iov` entries from
> the page_pool alloc cache containing `dma_addr = 0`.*
>
> The ordering inversion is fully proven at the source level. However, I have
> **not** performed runtime verification, so actual memory corruption or
> successful DMA to address 0 has **not** been proven — it remains hardware
> and driver dependent.
>
> The bug is reachable with `CAP_NET_ADMIN` (common in container
> environments) when using netkit with ZCRX.
>
> Root Cause
>
> In `net/core/netdev_rx_queue.c:347-348`:
>
> ```c
> __netif_mp_uninstall_rxq(virt_rxq, p); // DMA unmap + dma_addr=0
> __netif_mp_close_rxq(...);             // queue stop + NAPI disable (TOO LATE)
> ```
>
> This inverts the correct ordering used in normal device unregistration and
> io_uring close paths (stop first, then unmap).
>
> Impact
>
> - *Potential:* NIC DMA write to physical address 0 (or stale mappings
> with lazy IOMMU) leading to memory corruption.
> - *Requirements:* CAP_NET_ADMIN + netkit queue leasing + ZCRX installed
> on the leased queue.
> - *Current Status:* No runtime PoC or crash reproduction yet. The race
> window exists in theory but its practical exploitability needs confirmation.
>
> I am attaching the full detailed analysis.
>
> Proposed Fix
>
> [image: image.png]
>
> I am happy to provide more details or assist with testing.
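
For illustration only, the ordering problem described in the report can be modelled in plain userspace C. This is a toy sketch, not kernel code: `struct toy_rxq` and every `toy_*` function are hypothetical names invented for this example, and the poll placed between the two teardown steps simply assumes the worst-case interleaving rather than demonstrating that it occurs on real hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: none of these types or functions are kernel APIs. */
struct toy_rxq {
	bool     napi_enabled; /* device may still consume buffers when true */
	uint64_t dma_addr;     /* 0 models a buffer whose mapping was revoked */
	int      stale_uses;   /* times a live queue saw dma_addr == 0 */
};

/* Models one NAPI poll: a live queue consumes the buffer it was given. */
static void toy_napi_poll(struct toy_rxq *q)
{
	if (!q->napi_enabled)
		return;
	if (q->dma_addr == 0)
		q->stale_uses++;
}

static void toy_unmap(struct toy_rxq *q) { q->dma_addr = 0; }
static void toy_stop(struct toy_rxq *q)  { q->napi_enabled = false; }

/* Ordering as reported: unmap first, stop second. */
static int toy_teardown_reported(void)
{
	struct toy_rxq q = { .napi_enabled = true, .dma_addr = 0x1000 };

	toy_unmap(&q);
	toy_napi_poll(&q); /* assumed race window: queue is still live */
	toy_stop(&q);
	toy_napi_poll(&q);
	return q.stale_uses;
}

/* Ordering the report describes as correct: stop first, unmap second. */
static int toy_teardown_stop_first(void)
{
	struct toy_rxq q = { .napi_enabled = true, .dma_addr = 0x1000 };

	toy_stop(&q);
	toy_napi_poll(&q); /* no-op: queue already stopped */
	toy_unmap(&q);
	toy_napi_poll(&q);
	assert(!q.napi_enabled);
	return q.stale_uses;
}
```

In this model, `toy_teardown_reported()` counts one use of a zeroed address, while `toy_teardown_stop_first()` counts none. It says nothing about whether a given NIC or IOMMU would actually complete such a DMA.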