Re: [PATCH v2 6/7] nfs: Optimize direct I/O to use folios for requests
From: Pranjal Shrivastava
Date: Tue Jun 16 2026 - 13:24:39 EST
On Tue, Jun 16, 2026 at 11:29:13AM -0400, Trond Myklebust wrote:
Hi Trond,
> On Tue, 2026-06-16 at 13:39 +0000, Pranjal Shrivastava wrote:
> > Optimize nfs_direct_extract_pages() to group contiguous pages from the
> > same folio into single nfs_page structures. This effectively migrates
> > NFS Direct I/O from being page-based to being folio-based.
> >
> > Reduce the number of nfs_page allocations and subsequent iterations
> > by utilizing nfs_page_create_from_folio() to create aggregated
> > requests.
> >
> > Signed-off-by: Pranjal Shrivastava <praan@xxxxxxxxxx>
> > ---
> > fs/nfs/direct.c | 47 +++++++++++++++++++++++++++++++++++++----------
> > 1 file changed, 37 insertions(+), 10 deletions(-)
> >
> > diff --git a/fs/nfs/direct.c b/fs/nfs/direct.c
> > index e2a93cfb6c72..ddc6b27f5315 100644
> > --- a/fs/nfs/direct.c
> > +++ b/fs/nfs/direct.c
> > @@ -194,23 +194,45 @@ static ssize_t nfs_direct_extract_pages(struct nfs_direct_req *dreq,
> > return result;
> >
> > npages = (result + pgbase + PAGE_SIZE - 1) >> PAGE_SHIFT;
> > - for (i = 0; i < npages; i++) {
> > + for (i = 0; i < npages; ) {
> > + unsigned int chunk_len, folio_offset;
> > + unsigned int nr_to_add = 1;
> > struct nfs_page *req;
> > - unsigned int req_len = min_t(size_t, result - bytes, PAGE_SIZE - pgbase);
> > + struct folio *folio;
> >
> > - req = nfs_page_create_from_page(dreq->ctx, pagevec[i],
> > - pinned, pgbase, *pos,
> > - req_len);
> > + folio = page_folio(pagevec[i]);
>
> I'm clearly missing something. The memory pointed to by these pages can
> be any arbitrary user space (or kernel space) memory region. It could
> be mapped device memory, for instance.
>
> So why can you assume that page_folio() will resolve to a valid folio
> here?
AFAIU, the MM subsystem ensures that every valid struct page is part of
a folio. The documentation for page_folio() states this explicitly [1]:
"Every page is part of a folio. This function cannot be called on a
NULL pointer."
Since iov_iter_extract_pages() only returns pages that are successfully
pinned and tracked by the kernel, we are guaranteed that pagevec[i]
points to a valid struct page and thus a valid folio.
Regarding device-mapped memory, ZONE_DEVICE pages have also recently
been reworked to support folios (e.g. free_zone_device_folio() [2]).
If the memory is not part of a large compound page, page_folio() simply
returns the struct page pointer cast to a struct folio * [3]. In this
case the folio is a single page (order 0), and our extraction loop
handles it as a single-page request. Pages are only aggregated when
consecutive pagevec entries are contiguous within the same folio.
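
The single-page and same-folio cases can be illustrated with a small
userspace sketch. This is not the patch code: struct page and struct
folio here are simplified stand-ins, and model_page_folio(),
model_nr_in_folio(), model_count_requests() and model_demo() are
hypothetical names used only for this example. It assumes the usual
convention that a tail page stores its head pointer with bit 0 set.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace stand-ins for illustration only; these are NOT the kernel's
 * struct page / struct folio definitions.
 */
struct page {
	uintptr_t compound_head;	/* bit 0 set: tail page, other bits: head */
};

struct folio {
	struct page page;		/* a folio starts at its head page */
};

/* Model of page_folio(): tail pages resolve to their head, others to themselves. */
static struct folio *model_page_folio(struct page *page)
{
	uintptr_t head = page->compound_head;

	if (head & 1)
		return (struct folio *)(head - 1);
	return (struct folio *)page;
}

/*
 * Hypothetical helper: count how many entries starting at pagevec[i] are
 * consecutive pages of the same folio, i.e. how many pages one request
 * could cover.
 */
static unsigned int model_nr_in_folio(struct page **pagevec, unsigned int i,
				      unsigned int npages)
{
	struct folio *folio = model_page_folio(pagevec[i]);
	unsigned int nr = 1;

	while (i + nr < npages &&
	       pagevec[i + nr] == pagevec[i] + nr &&
	       model_page_folio(pagevec[i + nr]) == folio)
		nr++;
	return nr;
}

/* Walk the vector by group size rather than one page at a time. */
static unsigned int model_count_requests(struct page **pagevec,
					 unsigned int npages)
{
	unsigned int i, nreqs = 0;

	for (i = 0; i < npages; i += model_nr_in_folio(pagevec, i, npages))
		nreqs++;
	return nreqs;
}

/*
 * Example layout: pages 0 and 5 are order-0, pages 1..4 form one
 * four-page folio with its head at page 1.
 */
static unsigned int model_demo(void)
{
	static struct page memmap[6];
	struct page *pagevec[6];
	unsigned int i;

	for (i = 0; i < 6; i++) {
		memmap[i].compound_head = 0;
		pagevec[i] = &memmap[i];
	}
	for (i = 2; i <= 4; i++)
		memmap[i].compound_head = (uintptr_t)&memmap[1] | 1;

	return model_count_requests(pagevec, 6);
}
```

With this layout the six pinned pages collapse into three requests
(1 + 4 + 1 pages), whereas a strictly per-page loop would create six.
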
The only other thing to take care of is folio splitting, which is a
concern only when the caller does not hold a reference on the page.
In our case (NFS), iov_iter_extract_pages() has already pinned the
folio via GUP by this point. That pin ensures the folio cannot be
split or freed under us, which makes the page_folio() call and the
subsequent aggregation logic safe.
Finally, in cases where device memory is NOT backed by struct page
(e.g. dmabuf or PFN-based mappings via remap_pfn_range), such buffers
are already unsupported for NFS Direct I/O today. The underlying page
pinning (GUP) fails with -EFAULT in check_vma_flags() [4] before we
ever reach this point.
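
As a rough userspace model of that rejection (an assumption-laden
sketch, not the code in mm/gup.c): the MODEL_VM_* bit values below are
made up for illustration, and model_check_vma_flags() is a hypothetical
function that mirrors only the one check relevant here, namely that
mappings flagged as I/O or PFN-mapped are refused before any page is
pinned.

```c
#include <assert.h>
#include <errno.h>

/* Illustrative flag bits; these are NOT the kernel's VM_* values. */
#define MODEL_VM_PFNMAP	0x1UL
#define MODEL_VM_IO	0x2UL

/*
 * Simplified model of the relevant test: refuse mappings that are not
 * backed by struct page, accept everything else.
 */
static int model_check_vma_flags(unsigned long vm_flags)
{
	if (vm_flags & (MODEL_VM_IO | MODEL_VM_PFNMAP))
		return -EFAULT;
	return 0;
}
```

In this model a remap_pfn_range()-style mapping would carry
MODEL_VM_PFNMAP and be refused with -EFAULT, so no pagevec entry for it
would ever reach the extraction loop.
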
Given the above guarantees from the kernel, page_folio() resolves to a
valid folio at this point in the filesystem.
Thanks,
Praan
[1] https://elixir.bootlin.com/linux/v7.1-rc6/source/include/linux/page-flags.h#L291
[2] https://elixir.bootlin.com/linux/v7.1-rc6/source/mm/memremap.c#L416
[3] https://elixir.bootlin.com/linux/v7.1-rc6/source/include/linux/page-flags.h#L234
[4] https://elixir.bootlin.com/linux/v7.1-rc6/source/mm/gup.c#L1208