Re: [RFC PATCH 0/6] Deep talk about folio vmap

From: Vishal Moola (Oracle)
Date: Fri Mar 28 2025 - 17:09:35 EST


On Thu, Mar 27, 2025 at 05:28:27PM +0800, Huan Yang wrote:
> Bingbu reported in [1] that udmabuf vmap fails, and in [2] we discussed
> the folio vmap scenario that arises from udmabuf's misuse of vmap_pfn.
>
> We reached the conclusion that vmap_pfn prohibits the use of page-based
> PFNs:
> Christoph Hellwig: 'No, vmap_pfn is entirely for memory not backed by
> pages or folios, i.e. PCIe BARs and similar memory. This must not be
> mixed with proper folio backed memory.'
>
> But udmabuf still needs to handle vmap of HVO-backed folios, and the
> vmap issue still needs fixing. This RFC demonstrates the two points I
> mentioned in [2] and discusses them in more depth:
>
> Point1. Make a simple copy of the vmap_pfn code for udmabuf's own use,
> rather than touching the common vmap_pfn, and remove the pfn_valid check.
>
> Point2. Implement a folio-array-based vmap (vmap_folios) that takes a
> range (offset, nr_pages) for each folio, so it can handle vmap of HVO
> folios.
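
A rough sketch of the kind of range-based interface point2 describes might
look like the below; the struct and function names are purely illustrative,
not necessarily what the patches end up using:

/* Hypothetical: one entry per folio, describing the sub-range to map. */
struct folio_range {
	struct folio	*folio;		/* folio to map */
	pgoff_t		offset;		/* first page within the folio */
	unsigned long	nr_pages;	/* number of pages to map */
};

/* Hypothetical: map nr sub-ranges contiguously into vmalloc space. */
void *vmap_folios(const struct folio_range *ranges, unsigned int nr,
		  pgprot_t prot);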
>
> Patches 1-2 implement point1 and add a simple test to the udmabuf
> driver. Patches 3-5 implement point2, which can be tested the same way.
>
> Kasireddy also suggested that 'another option is to just limit udmabuf's
> vmap() to only shmem folios' (I guess folio_test_hugetlb_vmemmap_optimized
> can help with this).
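
If the shmem-only route were taken, a sketch of the check could look like
the following; this is my guess at the helper to use, assuming
folio_test_hugetlb_vmemmap_optimized() reports HVO folios as its name
suggests, so please double-check before relying on it:

/* Hypothetical helper: refuse page-based vmap of HVO hugetlb folios. */
static bool udmabuf_folio_vmappable(struct folio *folio)
{
	/*
	 * HVO frees a hugetlb folio's tail struct pages, so walking such
	 * a folio page by page for vmap is not safe.
	 */
	if (folio_test_hugetlb(folio) &&
	    folio_test_hugetlb_vmemmap_optimized(folio))
		return false;
	return true;
}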
>
> But I prefer point2 as the solution to this issue, and IMO a folio-based
> vmap is still needed.
>
> Compared to a page-based (or pfn-based) vmap, we have to split each large
> folio into individual struct pages, which requires a larger array and a
> longer iteration. If the tail struct pages do not exist (as with HVO),
> only a pfn-based vmap is possible, but there is no common API for it.
>
> In [2] we discussed that udmabuf can use hugetlb as the memory provider
> and can be given a range of it to use. If HVO is enabled for hugetlb, each
> folio's tail struct pages may be freed, so a page-based vmap is impossible
> and only a pfn-based one works, as shown in point1.
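
As a rough sketch of that pfn-based fallback (not the patch itself): the
pfns of a folio are contiguous starting at folio_pfn(), so an array suitable
for a vmap_pfn-style mapping can be built without ever touching the tail
struct pages:

/* Hypothetical helper: expand folios into a pfn array for mapping. */
static unsigned long *folios_to_pfns(struct folio **folios,
				     unsigned int nr_folios,
				     unsigned long pages_per_folio)
{
	unsigned long *pfns;
	unsigned long i, j, k = 0;

	pfns = kvmalloc_array(nr_folios * pages_per_folio, sizeof(*pfns),
			      GFP_KERNEL);
	if (!pfns)
		return NULL;

	for (i = 0; i < nr_folios; i++)
		for (j = 0; j < pages_per_folio; j++)
			pfns[k++] = folio_pfn(folios[i]) + j;

	return pfns;
}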
>
> Furthermore, a folio-based vmap only needs to record each folio (plus
> offset and nr_pages if a range is needed). For a 20MB vmap, a page-based
> approach needs 5120 page pointers (40KB); with 2MB folios, only 10 folio
> pointers (80 bytes) are needed.
>
> Matthew pointed out that Vishal also proposed a folio-based vmap,
> vmap_file [3]. This RFC wants a range-based folio mapping, not only a
> full-folio mapping (as for a file's folios), so it can also handle cases
> like vmapping a range of an HVO folio.

Hmmm, I should've been more communicative, sorry about that. V1 was
poorly implemented, and I've had a V2 sitting around that does Exactly
what you want.

I'll send V2 to the mailing list and you can take a look at it;
preferably you'd integrate that into this patchset instead (it would
make both the udmabuf and vmalloc code much neater).

> Please give me more suggestions.
>
> Test case:
> //enable/disable HVO
> 1. echo [1|0] > /proc/sys/vm/hugetlb_optimize_vmemmap
> //prepare HUGETLB
> 2. echo 10 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
> 3. ./udmabuf_vmap
> 4. Check the output, and dmesg for any warnings.
>
> [1] https://lore.kernel.org/all/9172a601-c360-0d5b-ba1b-33deba430455@xxxxxxxxxxxxxxx/
> [2] https://lore.kernel.org/lkml/20250312061513.1126496-1-link@xxxxxxxx/
> [3] https://lore.kernel.org/linux-mm/20250131001806.92349-1-vishal.moola@xxxxxxxxx/
>
> Huan Yang (6):
> udmabuf: try fix udmabuf vmap
> udmabuf: try udmabuf vmap test
> mm/vmalloc: try add vmap folios range
> udmabuf: use vmap_range_folios
> udmabuf: vmap test suit for pages and pfns compare
> udmabuf: remove no need code
>
> drivers/dma-buf/udmabuf.c | 29 +++++++++-----------
> include/linux/vmalloc.h | 57 +++++++++++++++++++++++++++++++++++++++
> mm/vmalloc.c | 47 ++++++++++++++++++++++++++++++++
> 3 files changed, 117 insertions(+), 16 deletions(-)
>
> --
> 2.48.1
>