Re: [RFC PATCH 0/2] mm: improve folio refcount scalability
From: Gladyshev Ilya
Date: Tue Jan 13 2026 - 02:32:04 EST
On 1/12/2026 7:17 PM, Kiryl Shutsemau wrote:
> On Mon, Jan 12, 2026 at 05:32:10PM +0300, Gladyshev Ilya wrote:
>> On 1/12/2026 2:49 PM, Kiryl Shutsemau wrote:
>>> On Mon, Jan 12, 2026 at 11:30:38AM +0300, Gladyshev Ilya wrote:
>>>> Gentle ping on this proposal
>>>
>>> I generally like the idea, but I would like to hear from folks who
>>> actually understand serialization.
>>>
>>> Also, do you have number for "a full CAS loop when the counter is
>>> approaching overflow" thing?
>>
>> I am not sure that overflow is a real problem because you need a very
>> specific race condition over a long time to achieve it...
>
> Yes. But if the page is popular for pinning, GUP_PIN_COUNTING_BIAS can
> cut the "very long time" substantially.
>
>> But as a safeguard, everything lower than 2^31 - #max concurrent
>> accesses (~#num cpu) should work, so let's say 2^30
>
> What I meant is when we put a branch/loop in the hot path, your
> performance numbers will likely not look as attractive. Am I wrong?

It would be under the same branch as the single CAS that already exists in this patch:

	if (page_count_writable(page)) {
		val = atomic_add_return(nr, &page->_refcount);
		ret = !(val & PAGEREF_LOCKED_BIT);
		if (unlikely(!ret)) {
			atomic_cmpxchg_relaxed(&page->_refcount, val, PAGEREF_LOCKED_BIT);
			/* [Proposed] if (failed && big enough) { CAS loop } */
		}
	}

Unless the "failed try_lock()" is a hot path somewhere [1], this added branch will be hidden under the already existing (unlikely-taken) branch.

[1]: Which I doubt, because a failed try_lock() usually involves a heavy re-lookup.
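
To make the "[Proposed]" comment above concrete, here is a rough userspace sketch of how the fallback could look, written with C11 atomics rather than the kernel's atomic_t API. It is not the patch itself. The assumptions are: PAGEREF_LOCKED_BIT is taken to be the top bit of a 32-bit counter, the 2^30 threshold is the "let's say 2^30" safeguard from this thread, and model_ref_add_unless_locked() and PAGEREF_DRIFT_LIMIT are hypothetical names used only for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the lock flag is the top bit of a 32-bit counter. */
#define PAGEREF_LOCKED_BIT  (1u << 31)
/* Hypothetical name; the value is the 2^30 safeguard from the thread. */
#define PAGEREF_DRIFT_LIMIT (1u << 30)

/* Hypothetical userspace model of the try-get path, not kernel code. */
static bool model_ref_add_unless_locked(_Atomic uint32_t *refcount, uint32_t nr)
{
	/* Fast path: unconditional add, standing in for atomic_add_return(). */
	uint32_t val = atomic_fetch_add(refcount, nr) + nr;

	if (!(val & PAGEREF_LOCKED_BIT))
		return true;

	/* Unlikely path: counter is locked, try once to undo the drift. */
	uint32_t cur = val;
	if (atomic_compare_exchange_strong_explicit(refcount, &cur,
						    PAGEREF_LOCKED_BIT,
						    memory_order_relaxed,
						    memory_order_relaxed))
		return false;

	/*
	 * Proposed fallback: the single CAS failed, so cur now holds the
	 * value another thread left behind. Only if the counter is still
	 * locked and has drifted past the limit, loop until it is reset.
	 */
	while ((cur & PAGEREF_LOCKED_BIT) &&
	       (cur & ~PAGEREF_LOCKED_BIT) >= PAGEREF_DRIFT_LIMIT) {
		if (atomic_compare_exchange_weak_explicit(refcount, &cur,
							  PAGEREF_LOCKED_BIT,
							  memory_order_relaxed,
							  memory_order_relaxed))
			break;
	}
	return false;
}
```

In this model the loop is reachable only after both the locked-bit check and the single CAS have failed, so the successful-get path keeps its one atomic add and one branch.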