Re: [RFC PATCH] mm: only set fault addrsss' access bit in do_anonymous_page

From: Wenchao Hao

Date: Tue Feb 10 2026 - 19:50:00 EST


On Tue, Feb 10, 2026 at 5:07 PM David Hildenbrand (Arm)
<david@xxxxxxxxxx> wrote:
>
> On 2/10/26 05:34, Wenchao Hao wrote:
> > When do_anonymous_page() creates mappings for huge pages, it currently sets
> > the access bit for all mapped PTEs (Page Table Entries) by default.
> >
> > This causes an issue where the Referenced field in /proc/pid/smaps cannot
> > distinguish whether a page was actually accessed.
>
> What is the use case that cares about that?
>

We have enabled 64KB large folios on Android devices, which can waste
memory when only part of a folio is ever touched. I want to measure what
proportion of memory is wasted this way, and reading the "Referenced" field
from /proc/pid/smaps is a relatively low-cost way to do that.
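
As a rough illustration of that measurement, here is a minimal sketch that
sums the Rss and Referenced fields of an smaps dump. It is not the tool we
actually use; the names summarize_smaps() and unreferenced_ratio() are made
up for this example, and it assumes the usual "Field:   N kB" line layout.

```python
import re

# Matches smaps lines such as "Referenced:          64 kB".
_FIELD = re.compile(r"^(Rss|Referenced|Anonymous):\s+(\d+) kB$")

def summarize_smaps(text):
    """Sum selected per-VMA fields (in kB) over a whole smaps dump."""
    totals = {"Rss": 0, "Referenced": 0, "Anonymous": 0}
    for line in text.splitlines():
        m = _FIELD.match(line.strip())
        if m:
            totals[m.group(1)] += int(m.group(2))
    return totals

def unreferenced_ratio(totals):
    """Fraction of resident memory that is not marked as referenced."""
    rss = totals["Rss"]
    return 0.0 if rss == 0 else (rss - totals["Referenced"]) / rss

# Usage (on Linux):
#   with open("/proc/self/smaps") as f:
#       totals = summarize_smaps(f.read())
#   print(totals, unreferenced_ratio(totals))
```

With the current behavior this ratio stays close to zero for freshly
faulted anonymous large folios, which is exactly the problem described
below.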

Additionally, with future hot/cold page identification in mind, we aim to
detect 64KB large folios in which some pages were never accessed and split
them into base pages to avoid the waste.

However, the current large folio implementation sets the access bit in all
page table entries (PTEs) of the large folio in do_anonymous_page(), which
makes it hard to tell whether the pre-allocated pages were ever actually
accessed.
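
To make the difference concrete, here is a toy model (not kernel code) of
one 64KB folio right after its first fault. It assumes 4KB base pages and
ignores everything else that feeds into smaps accounting, such as the
folio-level referenced flag; map_folio() and referenced_kb() are
hypothetical helpers.

```python
PAGE_KB = 4          # assumed base page size
FOLIO_PAGES = 16     # 16 x 4KB pages = one 64KB large folio

def map_folio(fault_idx, mark_all_young):
    """Return the per-PTE access bits right after the first fault."""
    return [mark_all_young or i == fault_idx for i in range(FOLIO_PAGES)]

def referenced_kb(access_bits):
    """Rough stand-in for what would be counted as Referenced."""
    return sum(access_bits) * PAGE_KB

current = map_folio(fault_idx=3, mark_all_young=True)    # today's behavior
proposed = map_folio(fault_idx=3, mark_all_young=False)  # only the faulting PTE

print(referenced_kb(current))   # 64
print(referenced_kb(proposed))  # 4
```

In the first case the whole folio looks referenced even though only one
page was touched, so the two situations cannot be told apart afterwards.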

> What we have right now is the exact same behavior as if you would get a
> PMD THP that has a single access+dirty bit at fault time.
>
> Also, architectures that support transparent PTE coalescing will not be
> able to coalesce until all PTE bits are equal.
>
> This level of imprecision is to be expected with large folios that only
> have a single access+dirty bit.
>

Thanks a lot for the response.

I found the following statement in the ARM manual, in "D8.5.5 Use of the
Contiguous bit with hardware updates to the translation tables":

> If hardware updates a translation table entry, and if the Contiguous bit in
> that entry is 1, then the members in a group of contiguous translation table
> entries can have different AF, AP[2], and S2AP[1] values.

Does this mean that even after hardware has aggregated multiple PTEs, it
can still set the AF and the other flag bits independently for each PTE in
the group?

If so, can software also set different AF bits within a group of 16 PTEs
without breaking transparent PTE coalescing?

I am asking because "D8.7.1 The Contiguous bit" contains this
requirement:

> Software is required to ensure that all of the adjacent translation table
> entries for the contiguous region point to a contiguous OA range with
> consistent attributes and permissions.

It does not say whether "attributes and permissions" include the AF bit.
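
To spell out what I mean by "different AF bits within a group", here is a
small sketch that inspects one group of 16 descriptors. It assumes the
stage 1 page descriptor layout with AF in bit 10 and the Contiguous bit in
bit 52; af_summary() is a made-up helper and says nothing about what the
hardware actually permits.

```python
PTE_AF = 1 << 10     # Access flag, bit 10 of a stage 1 page descriptor
PTE_CONT = 1 << 52   # Contiguous bit, bit 52
CONT_PTES = 16       # entries per contiguous group discussed here

def af_summary(ptes):
    """Describe the AF state of one group of 16 PTE values."""
    if len(ptes) != CONT_PTES:
        raise ValueError("expected one group of 16 PTEs")
    accessed = sum(1 for pte in ptes if pte & PTE_AF)
    return {
        "accessed": accessed,                           # PTEs with AF set
        "uniform_af": accessed in (0, CONT_PTES),       # AF equal across group
        "all_cont": all(pte & PTE_CONT for pte in ptes),
    }

# One PTE touched, the other 15 mapped but never accessed.
group = [PTE_CONT | PTE_AF] + [PTE_CONT] * 15
print(af_summary(group))
```

The case I care about is the one above: all 16 entries carry the
Contiguous bit, but AF is not uniform across the group.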

> --
> Cheers,
>
> David