Re: [RFC v1] io_uring/rsrc: add fast path huge page handling in buffer registration
From: Matthew Wilcox
Date: Wed Jun 10 2026 - 14:14:23 EST

On Wed, Jun 10, 2026 at 03:18:52PM +0200, David Hildenbrand (Arm) wrote:
> On 6/10/26 13:34, Christoph Hellwig wrote:
> > On Wed, Jun 10, 2026 at 11:54:01AM +0200, David Hildenbrand (Arm) wrote:
> >> There are some long-term plans on providing an interface that would abstract how
> >> you refcount something you GUP'ed. (because, some pages we GUP in the future
> >> might not even have a dedicated refcount, all still fairly unclear). But it's
> >> all not really finalized I think.
> >>
> >> For now, we could expose a folio+page/offset+nr_pages interface, where we,
> >> long-term, would not be able to return non-folio pages (e.g., vm_insert_page())
> >> and would instead, in the future, fail the request if we stumble over a
> >> non-folio thing in the page tables. That sounds reasonable for now.
> >
> > I think whatever we're going to use for direct I/O has to also support
> > non-folio pages, especially PCI P2P memory. So coming up with an
> > interface that supports this ASAP would be helpful.
>
> Yes.
>
> I think we can keep returning pages as long as the unpin interface knows the
> right thing to do to unpin them.

This would be the get_user_phyrs() interface I've talked about before.
https://lore.kernel.org/all/ZbVO2RKhw-dLUMvf@xxxxxxxxxxxxxxxxxxxx/
and the long thread:
https://lore.kernel.org/all/YdyKWeU0HTv8m7wD@xxxxxxxxxxxxxxxxxxxx/

> Would there be users for a new interface that returns page ranges as described
> above, that would want to still unpin stuff partially? E.g., we give them a page
> range that belongs to the same folio with only a single pin/reference, but they
> would want to logically split that range and unpin pages individually?

Urgh, no, we shouldn't do that. Ranges should be pinned / unpinned
as a whole. I'm sympathetic to "for this special operation we need to
create a new range from this existing range and adjust the refcount(s)
appropriately so each of the two ranges can be put separately", but
I'm not sympathetic to "we need to allow each page to be individually
refcounted".
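
To make the distinction concrete, here is a minimal userspace sketch of
"pin and unpin a range as a whole, with an explicit split operation".
It is a toy model under stated assumptions, not a proposed kernel API:
struct toy_folio, struct toy_range, toy_range_pin(), toy_range_unpin()
and toy_range_split() are all hypothetical names, and a plain int stands
in for whatever the real refcount would be.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a folio: one pin count shared by all its pages. */
struct toy_folio {
	int pincount;
};

/* Hypothetical pinned range: a run of pages inside a single folio. */
struct toy_range {
	struct toy_folio *folio;
	size_t first_page;	/* index of the first page within the folio */
	size_t nr_pages;
};

/* Pin a whole range: exactly one pin, however many pages it covers. */
static struct toy_range toy_range_pin(struct toy_folio *folio,
				      size_t first_page, size_t nr_pages)
{
	struct toy_range r = { folio, first_page, nr_pages };

	folio->pincount++;
	return r;
}

/* Unpin a whole range: drops the single pin that the range holds. */
static void toy_range_unpin(struct toy_range *r)
{
	assert(r->folio && r->folio->pincount > 0);
	r->folio->pincount--;
	r->folio = NULL;
	r->nr_pages = 0;
}

/*
 * Split r at page offset 'at': r keeps the first 'at' pages and *tail
 * receives the remainder. The split takes one extra pin so that each of
 * the two ranges can be put separately. Returns 0 on success, -1 if
 * 'at' would leave either side empty.
 */
static int toy_range_split(struct toy_range *r, size_t at,
			   struct toy_range *tail)
{
	if (at == 0 || at >= r->nr_pages)
		return -1;

	tail->folio = r->folio;
	tail->first_page = r->first_page + at;
	tail->nr_pages = r->nr_pages - at;
	r->nr_pages = at;
	r->folio->pincount++;
	return 0;
}
```

Note what is absent from the sketch: there is no way to unpin a single
page out of the middle of a range. A caller that wants to release part
of a range has to split first, and the pin count tracks the number of
ranges rather than the number of pages.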