Re: [RFC PATCH] mm: only set fault addrsss' access bit in do_anonymous_page

From: David Hildenbrand (Arm)

Date: Thu Feb 12 2026 - 03:54:18 EST


On 2/12/26 02:57, Wenchao Hao wrote:
On Wed, Feb 11, 2026 at 5:05 PM David Hildenbrand (Arm)
<david@xxxxxxxxxx> wrote:

On 2/11/26 01:49, Wenchao Hao wrote:
On Tue, Feb 10, 2026 at 5:07 PM David Hildenbrand (Arm)
<david@xxxxxxxxxx> wrote:

We have enabled 64KB large folios on Android devices, which may introduce
some memory waste. I want to figure out the proportion of memory waste
caused by large folios. Reading the "Referenced" field from /proc/pid/smaps
is a relatively low-cost method.

Right. And that imprecision is to be expected when you opt in to
something that manages memory at a different granularity and only has a
single a/d bit: a large folio.

Sure, individual PTEs *might* have independent a/d bits, but the
underlying thing (folio) has only a single one. And optimizations that
build on top (pte coalescing) reuse that principle that having a single
logical a/d bit is fine.


Additionally, considering future hot/cold page identification, we aim to
detect 64KB large folios where some pages are actually unaccessed and split
them into normal pages to avoid memory waste.

However, the current large folio implementation sets the access bit for all
page table entries (PTEs) of the large folio in the do_anonymous_page
function, making it hard to distinguish whether pre-allocated pages were
truly accessed.

The deferred shrinker uses a much simpler mechanism: if the page content
is zero, likely it was over-allocated and never used later.

It's not completely lightweight (scan pages for 0 content), but is
reliable, independent of the mapping type (PMD, cont-pte, whatever) and
independent of any access/dirty bits, leaving performance unharmed.

When you say "I want to figure out the proportion of memory waste", are
we talking about a debug feature?


Thanks for your explanation. I now understand the design logic.

What I’m proposing is mainly for debugging. After enabling 64K large folio
on Android, we observed increased application memory footprint, especially
for anonymous pages.

Since Android app memory usage depends on runtime scenarios, we cannot
confirm whether the growth is directly caused by large folios. We want to
analyze memory usage via the `Referenced` field in `/proc/[pid]/smaps`.

Scanning for zero-filled pages will be much easier and more reliable. That should be good enough for a debug feature.

I'm wondering what the best interface for something like that could be: we don't want to make "/proc/[pid]/smaps" slower for all users.

Maybe we could do that for debug kernels.

For example, with CONFIG_DEBUG_KERNEL, adding a new counter entry

Anon_Zero:

that just tests whether the page content of an anonymous page is all zeroes could be doable.
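
To illustrate the kind of check such a counter would perform, here is a rough userspace sketch, not kernel code. It assumes 4 KiB base pages and a 64 KiB folio (16 pages); the names SKETCH_PAGE_SIZE, SKETCH_FOLIO_PAGES, page_is_zero() and count_zero_pages() are made up for this example. A real implementation would have to map each anonymous page and would likely use an existing kernel helper for the scan instead of an open-coded loop.

```c
#include <stddef.h>

/* Assumptions for this sketch only: 4 KiB base pages, 64 KiB folios. */
#define SKETCH_PAGE_SIZE   4096UL
#define SKETCH_FOLIO_PAGES 16UL

/* Hypothetical helper: return 1 if the page holds only zero bytes. */
static int page_is_zero(const unsigned char *page)
{
	size_t i;

	for (i = 0; i < SKETCH_PAGE_SIZE; i++)
		if (page[i])
			return 0;	/* first non-zero byte ends the scan */
	return 1;
}

/*
 * Hypothetical helper: count all-zero base pages in a buffer that is
 * nr_pages base pages long. Multiplying the result by the page size
 * would give the value an "Anon_Zero:" style entry could report.
 */
static unsigned long count_zero_pages(const unsigned char *buf,
				      unsigned long nr_pages)
{
	unsigned long i, zero = 0;

	for (i = 0; i < nr_pages; i++)
		zero += page_is_zero(buf + i * SKETCH_PAGE_SIZE);
	return zero;
}
```

With a zeroed 64 KiB buffer in which only three of the 16 pages were ever written, count_zero_pages() returns 13, i.e. 52 KiB that was likely over-allocated. The scan cost is the reason to hide this behind a debug option: every untouched page is read in full, while a used page usually terminates the loop early.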

--
Cheers,

David