Re: [PATCH 1/2] mm: store zero pages to be swapped out in a bitmap

From: Usama Arif
Date: Fri May 31 2024 - 14:19:12 EST



On 30/05/2024 21:04, Matthew Wilcox wrote:
> On Thu, May 30, 2024 at 09:24:20AM -0700, Yosry Ahmed wrote:
>> I am wondering if it's even possible to take this one step further and
>> avoid reclaiming zero-filled pages in the first place. Can we just
>> unmap them and let the first read fault allocate a zero'd page like
>> uninitialized memory, or point them at the zero page and make them
>> read-only, or something? Then we could free them directly without
>> going into the swap code to begin with.
>
> I was having similar thoughts. You can see in do_anonymous_page() that
> we simply map the shared zero page when we take a read fault on
> unallocated anon memory.

Thanks Yosry and Matthew. I'm currently prototyping this to see how it might look, and hope to have an update next week.
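
For anyone following along, the read-fault behaviour Matthew mentions can be observed from userspace. The following is a minimal sketch, not kernel code: fresh_anon_reads_zero() is a hypothetical helper name, and it only shows that untouched private anonymous memory reads back as zeroes. It does not show which physical page backs the mapping.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only): map npages of private
 * anonymous memory, never write to it, and check that every byte
 * reads back as zero. Returns 1 if all zero, 0 if not, -1 if mmap fails.
 */
static int fresh_anon_reads_zero(size_t npages)
{
	long psz = sysconf(_SC_PAGESIZE);
	assert(psz > 0);
	size_t len = npages * (size_t)psz;

	unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
				MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;

	/* volatile so the compiler cannot elide the reads (and the faults) */
	const volatile unsigned char *v = p;
	int all_zero = 1;
	for (size_t i = 0; i < len; i++) {
		if (v[i] != 0) {
			all_zero = 0;
			break;
		}
	}

	munmap(p, len);
	return all_zero;
}
```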

> So my question is where are all these zero pages coming from in the Meta
> fleet? Obviously we never try to swap out the shared zero page (it's
> not on any LRU list). So I see three possibilities:
>
> - Userspace wrote to it, but it wrote zeroes. Then we did a memcmp(),
>   discovered it was zeroes and fell into this path. It would be safe
>   to just discard this page.
> - We allocated it as part of a THP. We never wrote to this particular
>   page of the THP, so it's zero-filled. While it's safe to just
>   discard this page, we might want to write it for better swap-in
>   performance.

It's mostly THP. Alex presented the numbers well in his THP series: https://lore.kernel.org/lkml/cover.1661461643.git.alexlzhu@xxxxxx/


> - Userspace wrote non-zeroes to it, then wrote zeroes to it before
>   abandoning use of this page, and so it eventually got swapped out.
>   Perhaps we could teach userspace to MADV_DONTNEED the page instead?
>
> Has any data been gathered on this? Maybe there are other sources of
> zeroed pages that I'm missing. I do remember a presentation at LSFMM
> in 2022 from Google about very sparsely used THPs.
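
The MADV_DONTNEED suggestion can be sketched from the userspace side as follows. This assumes Linux semantics for private anonymous mappings, where the range is dropped and the next access sees zero-fill-on-demand pages. dontneed_reads_zero() is a hypothetical name used only for illustration.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only): dirty one anonymous page,
 * then drop it with MADV_DONTNEED instead of overwriting it with
 * zeroes, and check that it reads back as zero afterwards.
 * Returns 1 if all zero, 0 if not, -1 on mmap/madvise failure.
 */
static int dontneed_reads_zero(void)
{
	long psz = sysconf(_SC_PAGESIZE);
	assert(psz > 0);
	size_t len = (size_t)psz;

	unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
				MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;

	/* Application uses the page for real (non-zero) data. */
	memset(p, 0xAB, len);

	/*
	 * Application is done with the page. Rather than memset(p, 0, len),
	 * which leaves a dirty zero-filled page for reclaim to deal with,
	 * tell the kernel the contents are no longer needed.
	 */
	if (madvise(p, len, MADV_DONTNEED) != 0) {
		munmap(p, len);
		return -1;
	}

	const volatile unsigned char *v = p;
	int all_zero = 1;
	for (size_t i = 0; i < len; i++) {
		if (v[i] != 0) {
			all_zero = 0;
			break;
		}
	}

	munmap(p, len);
	return all_zero;
}
```

The point of the sketch is that the page never becomes a dirty zero-filled page that has to be scanned or swapped; it is simply freed and refaulted on demand.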
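
As a rough illustration of the zero-filled check discussed in the first possibility above, here is a userspace sketch. It is not the kernel's implementation, and buf_is_zero_filled() is a hypothetical name. It scans a word at a time and stops at the first non-zero word, so a buffer with data near the start is rejected cheaply.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (illustration only): return 1 if the len bytes
 * at buf are all zero, 0 otherwise.
 */
static int buf_is_zero_filled(const void *buf, size_t len)
{
	assert(buf != NULL);

	const unsigned char *p = buf;
	unsigned long word;
	size_t i = 0;

	/* Word-sized chunks; memcpy avoids unaligned-access issues. */
	for (; i + sizeof(word) <= len; i += sizeof(word)) {
		memcpy(&word, p + i, sizeof(word));
		if (word != 0)
			return 0;
	}

	/* Remaining tail bytes, if len is not a multiple of the word size. */
	for (; i < len; i++) {
		if (p[i] != 0)
			return 0;
	}

	return 1;
}
```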