Re: [PATCH RFC] iommu/dma: Validate page before accessing P2PDMA state
From: Robin Murphy
Date: Fri Feb 27 2026 - 09:13:59 EST
On 2026-02-27 5:46 am, Ashish Mhetre wrote:
On 2/26/2026 1:28 PM, Leon Romanovsky wrote:
On Wed, Feb 25, 2026 at 08:11:29PM +0000, Pranjal Shrivastava wrote:
On Wed, Feb 25, 2026 at 09:56:09AM +0200, Leon Romanovsky wrote:
On Wed, Feb 25, 2026 at 10:19:41AM +0530, Ashish Mhetre wrote:
On 2/25/2026 2:27 AM, Pranjal Shrivastava wrote:
On Tue, Feb 24, 2026 at 02:32:21PM +0200, Leon Romanovsky wrote:
On Tue, Feb 24, 2026 at 10:42:57AM +0000, Ashish Mhetre wrote:
When mapping scatter-gather entries that reference reserved memory
regions without struct page backing (e.g., bootloader-created
carveouts), is_pci_p2pdma_page() dereferences the page pointer
returned by sg_page() without first verifying its validity.

Thanks Leon for the review. This crash started after commit 30280eee2db1
("iommu/dma: support PCI P2PDMA pages in dma-iommu map_sg").

I believe this behavior started after commit 88df6ab2f34b
("mm: add folio_is_pci_p2pdma()"). Prior to that change, the
is_zone_device_page(page) check would return false when given a
non-existent page pointer.
Doesn't folio_is_pci_p2pdma() also check for zone device?
I see[1] that it does:
static inline bool folio_is_pci_p2pdma(const struct folio *folio)
{
	return IS_ENABLED(CONFIG_PCI_P2PDMA) &&
		folio_is_zone_device(folio) &&
		folio->pgmap->type == MEMORY_DEVICE_PCI_P2PDMA;
}
I believe the problem arises due to the page_folio() call in
folio_is_pci_p2pdma(page_folio(page)); within is_pci_p2pdma_page().
page_folio() assumes it has a valid struct page to work with. For these
carveouts, that isn't true.
Potentially something like the following would stop the crash:
diff --git a/include/linux/memremap.h b/include/linux/memremap.h
index e3c2ccf872a8..e47876021afa 100644
--- a/include/linux/memremap.h
+++ b/include/linux/memremap.h
@@ -197,7 +197,8 @@ static inline void folio_set_zone_device_data(struct folio *folio, void *data)
 static inline bool is_pci_p2pdma_page(const struct page *page)
 {
-	return IS_ENABLED(CONFIG_PCI_P2PDMA) &&
+	return IS_ENABLED(CONFIG_PCI_P2PDMA) && page &&
+		pfn_valid(page_to_pfn(page)) &&
 		folio_is_pci_p2pdma(page_folio(page));
 }
But my broader question is: why are we calling a page-based API like
is_pci_p2pdma_page() on non-struct-page memory in the first place?
Could we instead add a helper to verify whether the sg_page() return
value is actually backed by a struct page? If it isn't, we should
arguably skip the P2PDMA logic entirely and fall back to a
dma_map_phys()-style path. Isn't handling these "pageless" physical
ranges the primary reason dma_map_phys() exists?
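For illustration, a minimal sketch of the kind of validity helper being
suggested here; the name sg_page_is_backed() is hypothetical and not
from any posted patch, and it assumes that on this configuration
page_to_pfn() is pure address arithmetic which never dereferences the
(possibly bogus) pointer:

/*
 * Hypothetical helper (illustrative only): report whether an SG entry
 * actually refers to a real struct page, so that page-based checks
 * such as is_pci_p2pdma_page() can be skipped for carveout-backed
 * entries.
 */
static inline bool sg_page_is_backed(struct scatterlist *sg)
{
	struct page *page = sg_page(sg);

	return page && pfn_valid(page_to_pfn(page));
}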
Thanks for the feedback, Pranjal. Yes, this will also fix the crash.

To clarify: are you suggesting we handle non-page-backed mappings inside
iommu_dma_map_sg() (within dma-iommu), or that callers should detect
non-page-backed memory and use dma_map_phys() instead of dma_map_sg()?
The former approach sounds better so that existing iommu_dma_map_sg()
callers don't need changes, but I'd like to confirm your preference.

The latter one.

The bug is in callers which used the wrong API; they need to be adapted.

Yup, I meant the latter.

Yes, the thing is, if the caller already knows that the region to be
mapped is NOT struct page-backed, then why does it use dma_map_sg()
variants? Before dma_map_phys() was added, there was no reliable way to
DMA-map such memory, and using dma_map_sg() was a workaround that
happened to work. I'm not sure whether it worked by design or by
accident, but the correct approach now is to use dma_map_phys().
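As a rough illustration of the caller-side change being discussed,
assuming the dma_map_phys()/dma_unmap_phys() prototypes from the
physical-address DMA mapping work, i.e. (dev, phys/dma, size, dir,
attrs); the carveout_* function names are made up:

#include <linux/dma-mapping.h>

/* Map a bootloader carveout for device access; no struct page involved. */
static dma_addr_t carveout_map(struct device *dev, phys_addr_t phys,
			       size_t size)
{
	dma_addr_t dma;

	dma = dma_map_phys(dev, phys, size, DMA_TO_DEVICE, 0);
	if (dma_mapping_error(dev, dma))
		return DMA_MAPPING_ERROR;

	return dma;
}

static void carveout_unmap(struct device *dev, dma_addr_t dma, size_t size)
{
	dma_unmap_phys(dev, dma, size, DMA_TO_DEVICE, 0);
}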
Thanks Leon and Pranjal for the detailed feedback. I'll update our callers to use
dma_map_phys() for non-page-backed buffers.
One question: would it make sense to add a check in iommu_dma_map_sg to
fail gracefully when non-page-backed buffers are passed, instead of crashing
the kernel?
No, it is the responsibility of drivers not to abuse kernel APIs
inappropriately. Checking for misuse adds overhead that penalises
correct users. dma_map_page/sg on non-page-backed memory has never been
valid, and it would only have been system-configuration-dependent luck
that it wasn't already blowing up before.

I guess dma-debug could add additional checks on these APIs similarly
to debug_dma_map_single(), but the fact that we've never even
considered checking for made-up bogus struct page pointers only goes to
show just how wrong a thing to do it is.
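As a very rough sketch of the kind of dma-debug check alluded to above,
hypothetical and only loosely modelled on debug_dma_map_single(); the
function name and reporting are made up:

/* Hypothetical dma-debug style sanity check for page-based mapping calls. */
static void debug_dma_check_page(struct device *dev, struct page *page)
{
	if (!IS_ENABLED(CONFIG_DMA_API_DEBUG))
		return;

	if (unlikely(!page || !pfn_valid(page_to_pfn(page))))
		dev_err(dev, "DMA mapping called on bogus struct page %px\n",
			page);
}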
Thanks,
Robin.