Re: [PATCH net-next 2/4] udmabuf: emit one sg entry per pinned folio
From: Bobby Eshleman
Date: Tue Jun 09 2026 - 11:27:36 EST
On Mon, Jun 08, 2026 at 03:59:04PM +0200, Christian König wrote:
> On 6/8/26 15:55, Bobby Eshleman wrote:
> >
> > On Sun, Jun 7, 2026 at 11:42 PM Christian König <christian.koenig@xxxxxxx> wrote:
> >
> > On 6/5/26 20:44, Bobby Eshleman wrote:
> > > On Fri, Jun 05, 2026 at 11:30:07AM +0200, Christian König wrote:
> > >> On 6/4/26 02:42, Bobby Eshleman wrote:
> > >>> From: Bobby Eshleman <bobbyeshleman@xxxxxxxx>
> > >>>
> > >>> get_sg_table() emitted one PAGE_SIZE sg entry per page even when the
> > >>> underlying folio was larger.
> > >>>
> > >>> Instead, walk folios[] and emit one sg entry per folio. When folios
> > >>> represent large pages (as is for MFD_HUGETLB), each sg entry is a large
> > >>> page. Normal PAGE_SIZE sg tables are unchanged.
> > >>>
> > >>> Required by net/core/devmem to support rx-buf-size > PAGE_SIZE with
> > >>> udmabuf.
> > >>
> > >> That doesn't explain why this is required.
> > >
> > > Sure, can definitely add. Devmem currently requires dmabuf sg entries to
> > > be length and size aligned when it allocates niovs for NIC page pools.
> > > Though udmabuf is not violating any dmabuf contract by emitting
> > > PAGE_SIZE entries and the above restriction is probably more a
> > > shortfalling of devmem, by emitting a single entry per folio this patch
> > > allows udmabuf to be used by devmem for large pages.
> > >
> > >>
> > >> Please note that accessing the pages/folio of an sg-table returned by DMA-buf is illegal and strictly forbidden!
> > >>
> > >> Regards,
> > >> Christian.
> > >
> > > It seems both devmem and io_uring zcrx at least introspect through to
> > > the sg-table to build NIC page pools (not accessing the memory itself,
> > > however). Is there a better way?
> >
> > That's an absolute NO-GO! We need to stop that immediately.
> >
> > Touching the underlying struct page of an DMA-buf exported sg-table is strictly forbidden.
> >
> > We even have code to wrap the sg_table and hide the struct pages on debug builds to catch those issues, see function dma_buf_wrap_sg_table().
> >
> > My last status is that the NIC page pools are build directly from the DMA addresses exposed by the sg_table.
> >
> > Was there any change I'm not aware of?
> >
> > Regards,
> > Christian.
> >
> >
> > Oh no change, your mental model is still current.
> > They just go through each sg and use sg_dma_address() on each.
>
> Ah, thanks! That was a near heart attack :D
>
> Yeah that is perfectly correct, question is do you then still really need this udmabuf change? I mean the DMA API usually merges together contiguous DMA addresses.
>
> Regards,
> Christian.
>
Hey Christian, sorry for the delay, I just wanted to double check what
I'm seeing...
I reverted the udmabuf patch and confirmed that devmem still runs into 4K
pages even for a hugepage udmabuf. I see that the dma_map_direct() path is
being taken, which, if I am reading the code correctly, results in
sg_dma_len(sg) inheriting sg->length directly (set by udmabuf's
sg_set_folio(..., PAGE_SIZE) call). The iommu_dma_map_phys() path, by
contrast, looks like it does merge when possible.
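
As a rough userspace illustration of the difference described above (not
kernel code): the sketch below models one path that copies each entry's
length through unchanged and one that coalesces address-contiguous
entries. struct toy_sg, TOY_PAGE_SIZE, toy_map_direct() and
toy_map_merge() are hypothetical names invented for this example. The
merge model also ignores the segment-size and boundary limits a real
IOMMU mapping would have to respect.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u

/* Hypothetical stand-in for one mapped scatterlist entry. */
struct toy_sg {
	uint64_t dma_addr;
	uint32_t dma_len;
};

/*
 * Direct-mapping model: every output entry keeps the length of its
 * input entry, so PAGE_SIZE entries stay PAGE_SIZE entries.
 */
static size_t toy_map_direct(const struct toy_sg *in, size_t n,
			     struct toy_sg *out)
{
	assert(in != NULL && out != NULL);

	for (size_t i = 0; i < n; i++)
		out[i] = in[i];

	return n;
}

/*
 * Merging model: an entry that starts exactly where the previous
 * output entry ends is folded into it.
 */
static size_t toy_map_merge(const struct toy_sg *in, size_t n,
			    struct toy_sg *out)
{
	size_t m = 0;

	assert(in != NULL && out != NULL);

	for (size_t i = 0; i < n; i++) {
		if (m > 0 &&
		    out[m - 1].dma_addr + out[m - 1].dma_len == in[i].dma_addr)
			out[m - 1].dma_len += in[i].dma_len;
		else
			out[m++] = in[i];
	}

	return m;
}
```

Feeding 512 contiguous 4K entries (one 2M hugepage) through the first
model yields 512 entries of 4K, while the second yields a single 2M
entry. A consumer that sizes its buffers from the per-entry DMA length
would only see the large page in the second case.
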
Best,
Bobby