Re: [PATCH v1 5/5] mm/shmem: optimize file read with folio batching

From: Baolin Wang

Date: Mon May 25 2026 - 23:12:14 EST

On 5/25/26 5:47 PM, Chi Zhiling wrote:

On 5/25/26 4:37 PM, Baolin Wang wrote:

On 5/20/26 6:15 PM, Chi Zhiling wrote:

From: Chi Zhiling <chizhiling@xxxxxxxxxx>

Optimize shmem file read by using filemap_get_folios_contig() to
batch fetch contiguous folios from the page cache, reducing the
overhead of repeated shmem_get_folio() calls.

When the folio batch is exhausted, attempt to refill it with
filemap_get_folios_contig(). If no folios are found (hole or swapped
out pages), fall back to shmem_get_folio() to handle these cases
individually.

Additionally:
- Defer folio_put() until the batch is exhausted or on exit
- Add folio_test_uptodate() check before copying to ensure data
   validity

Signed-off-by: Chi Zhiling <chizhiling@xxxxxxxxxx>
---
mm/shmem.c | 50 ++++++++++++++++++++++++++++++++++++--------------
1 file changed, 36 insertions(+), 14 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index 96ea5a4c0aff..e0eacc23cccd 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -3355,11 +3355,14 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
      struct file *file = iocb->ki_filp;
      struct inode *inode = file_inode(file);
      struct address_space *mapping = inode->i_mapping;
+    struct folio_batch fbatch;
      pgoff_t index;
      unsigned long offset;
      int error = 0;
      ssize_t retval = 0;
+    folio_batch_init(&fbatch);
+
      for (;;) {
          struct folio *folio = NULL;
          struct page *page = NULL;
@@ -3372,18 +3375,38 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
              break;
          index = iocb->ki_pos >> PAGE_SHIFT;
-        error = shmem_get_folio(inode, index, 0, &folio, SGP_READ);
-        if (error) {
-            if (error == -EINVAL)
-                error = 0;
-            break;
+fetch:
+        folio = folio_batch_next(&fbatch);
+        if (!folio) {
+            pgoff_t start = index;
+            pgoff_t end = (iocb->ki_pos + to->count - 1) >> PAGE_SHIFT;
+
+            if (folio_batch_count(&fbatch)) {
+                for (int i = 0; i < folio_batch_count(&fbatch); i++)
+                    folio_put(fbatch.folios[i]);
+                folio_batch_reinit(&fbatch);
+            }
+
+            filemap_get_folios_contig(inode->i_mapping, &start, end, &fbatch);
+            if (folio_batch_count(&fbatch))
+                goto fetch;
+
+            error = shmem_get_folio(inode, index, 0, &folio, SGP_READ);
+            if (unlikely(error)) {
+                if (error == -EINVAL)
+                    error = 0;
+                break;
+            }
+            if (folio) {
+                folio_unlock(folio);
+                folio_batch_add(&fbatch, folio);
+                fbatch.i++;
+            }
          }
-        if (folio) {
-            folio_unlock(folio);
+        if (folio) {
              page = folio_file_page(folio, index);
              if (PageHWPoison(page)) {
-                folio_put(folio);
                  error = -EIO;
                  break;
              }

I haven't tested it yet, but I'm sure this will break the fallback_page_copy mode.

You're fetching from the batch at folio granularity here, but in fallback_page_copy mode, we still copy the data at page granularity.

Yes, that's a bug, Sashika has already reported it. I’ll fix it in the next version.

Good.

Sashika also reported another potential issue below, but I think that is a misunderstanding, because a folio fetched from the swap cache is already marked Uptodate before being added to the page cache. What do you think?

I'm not concerned about swap-in folios, since we already ensure the folio is uptodate when swapping out.

What I'm worried about is that this change moves the folio_test_uptodate() check outside of the folio lock, which could introduce some race? Of course, with the original logic, the folio state could also transition from !uptodate to uptodate after the folio
is unlocked (I need more thought).

Another race condition that needs to be addressed is, when getting the folio locklessly via filemap_get_folios_contig(), there could be a concurrent race with swap-out (because you removed the folio->mapping check under folio lock). This could result in reading zero data when it shouldn't (the folio actually needs to be swapped in).