Re: [PATCH v2 4/4] mm: try to free swapcache for non-LRU folios
From: David Hildenbrand (Arm)
Date: Fri Jun 26 2026 - 12:36:33 EST
>>> That changed that behavior such that we *must* now always fallback to do_wp_page().
>>>
>>> What a mess (I didn't ack)
>
> That's awkward, my bad, terribly sorry about this :(
>
> I sincerely apologize for the oversight and the trouble this has caused.
>
All good, I was just surprised to see a previous optimization partially
reverted without a clear reasoning :)
Because it should have removed the handling in should_try_to_free_swap() as well.
It's good that we are discussing it now!
> I haven't seen any performance regression in any workload recently
> though, or any correctness issue, perhaps the round trip of
> do_wp_page wasn't that bad. It should still catch the reuse folios,
> just more costly than doing things in-place.
Right, do_wp_page() handles it, after the page was mapped. It adds some overhead,
but fortunately no TLB flush if we're just upgrading write permissions.
The optimization dates back to pre PageAnonExclusive handling.
>
> I think we should restore the original check first. We might also want to
> avoid dropping the swap cache if the folio will not be reused, which
> was discussed here:
>
> https://lore.kernel.org/linux-mm/CAMgjq7BDfvNXdWH0cqarsujjUn3i3tDDhDkmSg01TR4h-tDorQ@xxxxxxxxxxxxxx/
>
> Maybe extracting some common part into a helper can help make this
> cleaner.
>
>>
>> So it seems we need a patch before patch 4 to fix the missed
>> swapcache freeing on write faults, because the write bit has already
>> been cleared by the time the folio is mapped and reused?
>>
>> diff --git a/mm/memory.c b/mm/memory.c
>> index 14577c67c61a..e8fc5a86f6ea 100644
>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@ -4752,6 +4752,7 @@ static void check_swap_exclusive(struct folio
>> *folio, swp_entry_t entry,
>> */
>> vm_fault_t do_swap_page(struct vm_fault *vmf)
>> {
>> + enum fault_flag fault_flags = vmf->flags;
>> struct vm_area_struct *vma = vmf->vma;
>> struct folio *swapcache = NULL, *folio;
>> struct page *page;
>> @@ -5091,7 +5092,7 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
>> * an additional reference to the folio.
>> */
>> if (should_try_to_free_swap(si, folio, vma, nr_pages +
>> - !folio_test_lru(folio), vmf->flags))
>> + !folio_test_lru(folio), fault_flags))
>> folio_free_swap(folio);
>>
>> folio_unlock(folio);
>>
>>
>> Best Regards
>> Barry
>
> Hi Barry,
>
> The problem is more than that, the `exclusive || folio_ref_count(folio) == 1`
> in do_swap_page is also ineffective now.
Exactly.
If the roundtrip through do_wp_page() is good enough today, we can just do
@@ -4512,7 +4516,6 @@ static vm_fault_t remove_device_exclusive_entry(struct vm_fault *vmf)
static inline bool should_try_to_free_swap(struct swap_info_struct *si,
struct folio *folio,
struct vm_area_struct *vma,
- unsigned int extra_refs,
unsigned int fault_flags)
{
if (!folio_test_swapcache(folio))
@@ -4528,14 +4531,7 @@ static inline bool should_try_to_free_swap(struct swap_info_struct *si,
if (mem_cgroup_swap_full(folio) || (vma->vm_flags & VM_LOCKED) ||
folio_test_mlocked(folio))
return true;
- /*
- * If we want to map a page that's in the swapcache writable, we
- * have to detect via the refcount if we're really the exclusive
- * user. Try freeing the swapcache to get rid of the swapcache
- * reference only in case it's likely that we'll be the exclusive user.
- */
- return (fault_flags & FAULT_FLAG_WRITE) && !folio_test_ksm(folio) &&
- folio_ref_count(folio) == (extra_refs + folio_nr_pages(folio));
+ return false;
}
static vm_fault_t pte_marker_clear(struct vm_fault *vmf)
@@ -5095,7 +5091,7 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
* Do it after mapping, so raced page faults will likely see the folio
* in swap cache and wait on the folio lock.
*/
- if (should_try_to_free_swap(si, folio, vma, nr_pages, vmf->flags))
+ if (should_try_to_free_swap(si, folio, vma, vmf->flags))
folio_free_swap(folio);
folio_unlock(folio);
But then, one question is whether we'd actually want to try removing the swapcache when
we mapped the page writable (iow: exclusive)?
@@ -4512,7 +4516,7 @@ static vm_fault_t remove_device_exclusive_entry(struct vm_fault *vmf)
static inline bool should_try_to_free_swap(struct swap_info_struct *si,
struct folio *folio,
struct vm_area_struct *vma,
- unsigned int extra_refs,
+ bool exclusive,
unsigned int fault_flags)
{
if (!folio_test_swapcache(folio))
@@ -4529,13 +4533,11 @@ static inline bool should_try_to_free_swap(struct swap_info_struct *si,
folio_test_mlocked(folio))
return true;
/*
- * If we want to map a page that's in the swapcache writable, we
- * have to detect via the refcount if we're really the exclusive
- * user. Try freeing the swapcache to get rid of the swapcache
- * reference only in case it's likely that we'll be the exclusive user.
+ * We have an exclusive page that was mapped writable or will soon
+ * be mapped writable (as we are in a write fault). Let's just try
+ * to reclaim swap immediately.
*/
- return (fault_flags & FAULT_FLAG_WRITE) && !folio_test_ksm(folio) &&
- folio_ref_count(folio) == (extra_refs + folio_nr_pages(folio));
+ return (fault_flags & FAULT_FLAG_WRITE) && exclusive;
}
static vm_fault_t pte_marker_clear(struct vm_fault *vmf)
@@ -5095,7 +5097,7 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
* Do it after mapping, so raced page faults will likely see the folio
* in swap cache and wait on the folio lock.
*/
- if (should_try_to_free_swap(si, folio, vma, nr_pages, vmf->flags))
+ if (should_try_to_free_swap(si, folio, vma, exclusive, vmf->flags))
folio_free_swap(folio);
folio_unlock(folio);
--
Cheers,
David