Re: [PATCH v4 0/2] mm: improve folio refcount scalability
From: ilya . gladyshev
Date: Sat Jun 20 2026 - 14:19:47 EST
>
> >
> > This patch optimizes small file read performance and overall folio refcount
> > scalability by refactoring page_ref_add_unless [core of folio_try_get].
> > This is an alternative approach to previous attempts to fix small read
> > performance by avoiding refcount bumps [1][2].
> >
> Thanks. Nice numbers.
>
> AI review had some things to say:
> https://sashiko.dev/#/patchset/df26082871b4c65b2bd38d409026237c08572836@xxxxxxxxx
Among some minor issues, it also pointed out a funny ABA race:
```
T1/T2 work with pages of type X.
T3 works with pages of type Y.
T1: page_dec_and_test()
T1: -> sub refcount [1 -> 0]
T1: -> *interrupted* (very bad hypervisor, for example)
T2: optimistic get() [0 -> 1]
T2: put page back [1 -> 0]
T2: calls dtor for type X, returns into the allocator
T3: receives page of type Y, sets refcount to 1
T3: page_dec_and_test()
T3: -> sub refcount [1 -> 0]
*T1 resumes execution*
T1: -> CAS [0->LOCKED]
T1: BUG: calls dtor of type X on page of type Y
```
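The interleaving above can be replayed deterministically in userspace.
This is only a sketch: it assumes the put path is a decrement followed
by a separate CAS of 0 -> LOCKED, as the trace suggests. REF_LOCKED,
struct page and all helper names here are made up for illustration and
are not taken from the patch.
```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define REF_LOCKED (-1)	/* hypothetical "being freed" sentinel */

enum page_type { TYPE_X, TYPE_Y };

struct page {
	atomic_int refcount;
	enum page_type type;
};

/* Step 1 of the put: drop one reference, report whether we hit zero. */
static bool ref_sub_hit_zero(struct page *p)
{
	return atomic_fetch_sub(&p->refcount, 1) == 1;
}

/* Step 2 of the put: claim the page for destruction, 0 -> LOCKED. */
static bool ref_try_lock_zero(struct page *p)
{
	int expected = 0;

	return atomic_compare_exchange_strong(&p->refcount, &expected,
					      REF_LOCKED);
}

/* Optimistic get: allowed from 0, refused only when LOCKED. */
static bool ref_try_get(struct page *p)
{
	int old = atomic_load(&p->refcount);

	do {
		if (old == REF_LOCKED)
			return false;
	} while (!atomic_compare_exchange_weak(&p->refcount, &old, old + 1));
	return true;
}

/* Returns 1 if T1 ends up owning a page of a type it did not expect. */
int replay_aba(void)
{
	struct page pg;
	enum page_type t1_dtor_type = TYPE_X;	/* dtor T1 would call */
	bool t1_zero, t2_got, t2_freed, t3_zero, t1_locked;

	atomic_init(&pg.refcount, 1);
	pg.type = TYPE_X;

	t1_zero = ref_sub_hit_zero(&pg);	/* T1: [1 -> 0], then stalls */

	t2_got = ref_try_get(&pg);		/* T2: [0 -> 1] */
	t2_freed = ref_sub_hit_zero(&pg) &&	/* T2: [1 -> 0] */
		   ref_try_lock_zero(&pg);	/* T2: [0 -> LOCKED], dtor X */
	assert(t2_got && t2_freed);

	pg.type = TYPE_Y;			/* allocator reuses the page */
	atomic_store(&pg.refcount, 1);		/* T3: refcount set to 1 */
	t3_zero = ref_sub_hit_zero(&pg);	/* T3: [1 -> 0] */

	t1_locked = t1_zero && ref_try_lock_zero(&pg);	/* T1 resumes */

	return t1_locked && t3_zero && pg.type != t1_dtor_type;
}
```
In this model T1's CAS cannot tell "my zero" from "T3's zero", so
replay_aba() reports the mismatch. It says nothing about whether the
real allocator lets such a window open.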
While this race seems unrealistic to me, since a full allocator cycle
has to fit between T1's two atomic operations, I wasn't able to prove
on the first attempt that it cannot happen. Maybe there is some
synchronization in the allocator that at least forbids X != Y, or
something.
I'll try to research fixes/proofs a little bit more, but I am afraid
that unless someone wise with mm/ knowledge comes up with some fact that
I missed, this patch indeed has a major (but unrealistic) flaw.
--
Sorry for the delay, grass was more touchable than ever
Ilya Gladyshev