Re: [RFC PATCH v2 0/1] Introduce vmap_file()

From: Huan Yang
Date: Tue Apr 01 2025 - 02:09:21 EST



On 2025/4/1 11:19, Vishal Moola (Oracle) wrote:
On Tue, Apr 01, 2025 at 10:21:46AM +0800, Huan Yang wrote:
On 2025/4/1 09:50, Vishal Moola (Oracle) wrote:
On Mon, Mar 31, 2025 at 10:05:53AM +0800, Huan Yang wrote:
Hi Vishal,

On 2025/3/29 05:13, Vishal Moola (Oracle) wrote:
Currently, users have to call vmap() or vmap_pfn() to map pages to
kernel virtual space. vmap_pfn() is for special pages (i.e. pfns
without struct page). vmap() handles normal pages.

With large folios, we may want to map ranges that only span
part of a folio (e.g. mapping half of a 2MB folio).
vmap_file() will allow us to do so.
You mention that vmap_file() can support vmapping a range of a folio, but when I look at the code,
I can't figure out how to use it. Maybe I missed something? :)
I took a look at the udmabuf code. Rather than iterating through the
folios using pfns, you can calculate the corresponding file offsets
(maybe you already have them?) to map the desired folios.
Currently, udmabuf's folios are not simply file-based (even though all of the memory comes from a memfd).
A user can hand arbitrary ranges of a memfd to udmabuf. For example:

Say we have a 4M memfd and the user splits it into [0, 2M), [1M, 2M), [2M, 4M). As you can see, the 1M-2M range repeats.

These ranges can be gathered by udmabuf_create_list and then used by udmabuf. So udmabuf records them as a folio array plus an offset array.
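
To make the example above concrete, here is a minimal userspace sketch (not the actual udmabuf code) of how such ranges flatten into a per-page folio + offset list. The names build_entries, struct range and struct entry are hypothetical, and it assumes 4K base pages and a memfd backed by 2M folios:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SZ  4096ULL        /* assumed base page size */
#define FOLIO_SZ (2ULL << 20)   /* assumed: memfd is backed by 2M folios */

/* One user-supplied range of the memfd (hypothetical type). */
struct range { uint64_t offset, size; };

/* Per-page record: index of the backing folio and byte offset inside it. */
struct entry { uint64_t folio, offset; };

/*
 * Walk the ranges in order and emit one entry per page.
 * Returns the number of entries written, or 0 if out[] is too small.
 */
static size_t build_entries(const struct range *r, size_t nr,
                            struct entry *out, size_t cap)
{
    size_t n = 0;

    for (size_t i = 0; i < nr; i++) {
        /* Ranges are expected to be page aligned. */
        assert(r[i].offset % PAGE_SZ == 0 && r[i].size % PAGE_SZ == 0);

        for (uint64_t off = r[i].offset;
             off < r[i].offset + r[i].size; off += PAGE_SZ) {
            if (n == cap)
                return 0;
            out[n].folio = off / FOLIO_SZ;
            out[n].offset = off % FOLIO_SZ;
            n++;
        }
    }
    return n;
}
```

With the three ranges above, folio 0 shows up twice in the list (once for [0, 2M) and again for [1M, 2M)), so the list cannot be described by a single file offset range.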
I was thinking you could call vmap_file() on every sub-range and use
those addresses. It should work, but we'd have to look at making the udmabuf APIs
support it.

Hmmm, how would we get a contiguous virtual address? Or is there a way to merge the addresses returned by each separate vmap?

IMO, a user invoking vmap wants to map scattered memory into one contiguous virtual address range, but with your suggestion
I don't think that can be done. :)


I think a vmap_file() based on an address_space range can't help here.
I'm not familiar with the memfd/gup code yet, but I'm fairly confident
those memfds will have associated ->f_mappings that would suffice. They
are file descriptors after all.
Agree with this.

And this API is still aimed at file vmap, so it may not be suitable for the problem I mentioned in:

https://lore.kernel.org/lkml/20250312061513.1126496-1-link@xxxxxxxx/
I'm not sure which problem you're referring to, could you be more
specific?
1. udmabuf usage is not the same as file vmap usage.

2. udmabuf can't use struct page if HVO hugetlb is enabled and in use.
vmap_file() doesn't depend on tail page structs.

It still needs a pfn-based vmap or a folio-offset-based range vmap. (Or we could simply reject vmap on HVO folios.) :)

Thanks,
Huan Yang

Create a function, vmap_file(), to map a specified range of a given
file to kernel virtual space. vmap_file() is an in-kernel equivalent
to mmap(), and can be useful for filesystems.

---
v2:
- Reword cover letter to provide a clearer overview of the current
vmalloc APIs, and usefulness of vmap_file()
- EXPORT_SYMBOL_GPL() instead of EXPORT_SYMBOL()
- Provide support to partially map file folios
- Demote this to RFC while we look for users
--
I don't have a user for this function right now, but it will be
useful as users start converting to using large folios. I'm just
putting it out here for anyone that may find a use for it.

This seems like the sensible way to implement it, but I'm open
to tweaking the function's semantics.

I've Cc-ed a couple people that mentioned they might be interested
in using it.

Vishal Moola (Oracle) (1):
mm/vmalloc: Introduce vmap_file()

include/linux/vmalloc.h | 2 +
mm/vmalloc.c | 113 ++++++++++++++++++++++++++++++++++++++++
2 files changed, 115 insertions(+)