Re: [RFC PATCH 3/3] mm: entirely remove lru_add_drain in do_swap_page

From: Barry Song

Date: Wed Jun 17 2026 - 06:01:18 EST

On Wed, Jun 17, 2026 at 1:38 PM Kairui Song <ryncsn@xxxxxxxxx> wrote:
>
> On Thu, Jun 11, 2026 at 6:52 PM Barry Song (Xiaomi) <baohua@xxxxxxxxxx> wrote:
> >
> > We are doing a lot of redundant lru_add_drain() calls in
> > do_swap_page(), especially for synchronous I/O devices. For
> > example, the test program below currently ends up draining
> > lru_cache 100% of the time:
> >
> > #include <sys/mman.h>
> >
> > #define SIZE (100 * 1024 * 1024)
> >
> > int main(int argc, char *argv[])
> > {
> > 	while (1) {
> > 		volatile int *p = mmap(0, SIZE, PROT_READ | PROT_WRITE,
> > 				MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
> >
> > 		/* Populate the region, page it out, then fault it back in. */
> > 		for (int i = 0; i < SIZE / sizeof(int); i++)
> > 			p[i] = i % 64;
> > 		madvise((void *)p, SIZE, MADV_PAGEOUT);
> > 		for (int i = 0; i < SIZE / sizeof(int); i++)
> > 			p[i] = i % 64;
> > 		munmap((void *)p, SIZE);
> > 	}
> > 	return 0;
> > }
> >
> > Folio reuse now relies primarily on the exclusive hint, so draining
> > lru_cache just to drop the reference it holds is largely
> > irrelevant.
> >
> > Signed-off-by: Barry Song (Xiaomi) <baohua@xxxxxxxxxx>
> > ---
> > mm/memory.c | 10 ----------
> > 1 file changed, 10 deletions(-)
> >
> > diff --git a/mm/memory.c b/mm/memory.c
> > index ce8ef27e7a54..b5a78670bcc8 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -4903,16 +4903,6 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
> >  	} else if (folio != swapcache)
> >  		page = folio_page(folio, 0);
> >
> > -	/*
> > -	 * If we want to map a page that's in the swapcache writable, we
> > -	 * have to detect via the refcount if we're really the exclusive
> > -	 * owner. Try removing the extra reference from the local LRU
> > -	 * caches if required.
> > -	 */
> > -	if ((vmf->flags & FAULT_FLAG_WRITE) &&
> > -	    !folio_test_ksm(folio) && !folio_test_lru(folio))
> > -		lru_add_drain();
> > -
> >  	folio_throttle_swaprate(folio, GFP_KERNEL);
> >
> >  	/*
> > --
> > 2.39.3 (Apple Git-146)
> >
> >
>
> There is a ref check in should_try_to_free_swap() though. Perhaps we
> should improve that part too, or we may have more folios
> stuck in swap cache after this change?

Good catch. We could assume that a !LRU folio has an extra reference held
by lru_cache. Does the below make sense to you?

diff --git a/mm/memory.c b/mm/memory.c
index ff338c2abe92..237220ae1572 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -5094,8 +5094,11 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 	 * Remove the swap entry and conditionally try to free up the swapcache.
 	 * Do it after mapping, so raced page faults will likely see the folio
 	 * in swap cache and wait on the folio lock.
+	 * Assume that lru_cache contributes an extra reference for non-LRU
+	 * folios.
 	 */
-	if (should_try_to_free_swap(si, folio, vma, nr_pages, vmf->flags))
+	if (should_try_to_free_swap(si, folio, vma, nr_pages +
+				    !folio_test_lru(folio), vmf->flags))
 		folio_free_swap(folio);

 	folio_unlock(folio);
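
To make the reference accounting concrete, here is a toy userspace model of
the idea (not kernel code). It assumes the check treats a folio as
exclusively owned when its refcount equals the caller's extra references plus
one swap cache reference per page. struct toy_folio and the toy_* helpers are
made-up names for illustration only.

```c
#include <stdbool.h>

/* Toy stand-in for a folio; NOT the kernel's struct folio. */
struct toy_folio {
	int refcount;	/* total references currently held */
	int nr_pages;	/* pages in the folio; assumed one swap cache ref each */
	bool on_lru;	/* false while the folio still sits in a per-CPU batch */
};

/*
 * Hypothetical model of the exclusivity test (an assumption, see above):
 * the folio counts as exclusively ours when nothing beyond the expected
 * extra references and the swap cache references is held.
 */
bool toy_refs_exclusive(const struct toy_folio *f, int extra_refs)
{
	return f->refcount == extra_refs + f->nr_pages;
}

/*
 * Expected extra references with the proposed adjustment: one per page
 * being mapped, plus one if the folio is not yet on the LRU, on the
 * assumption that lru_cache then holds a reference.
 */
int toy_extra_refs(const struct toy_folio *f, int nr_mapped)
{
	return nr_mapped + !f->on_lru;
}
```

For an order-0 folio that is still batched (refcount 3: swap cache, mapping,
lru_cache), toy_refs_exclusive(&f, 1) fails in this model, while passing
toy_extra_refs(&f, 1) lets the check succeed without draining first.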