Re: [PATCH] mm/arm: pgtable: remove young bit check for pte_valid_user
From: Brian Ruley
Date: Thu Apr 09 2026 - 11:22:08 EST
On Apr 09, Will Deacon wrote:
>
> On Thu, Apr 09, 2026 at 03:54:45PM +0300, Brian Ruley wrote:
> > Fixes cache desync, which can cause undefined instruction,
> > translation and permission faults under heavy memory use.
> >
> > This is an old bug introduced in commit 1971188aa196 ("ARM: 7985/1: mm:
> > implement pte_accessible for faulting mappings"), which included a check
> > for the young bit of a PTE. The underlying assumption was that old pages
> > are not cached, therefore, `__sync_icache_dcache' could be skipped
> > entirely.
> >
> > However, under extreme memory pressure, page migrations happen
> > frequently and the assumption of uncached "old" pages does not hold.
> > Especially for systems that do not have swap, the migrated pages are
> > unequivocally marked old. This presents a problem, as it is possible
> > for the original page to be immediately mapped to another VA that
> > happens to share the same cache index in VIPT I-cache (we found this
> > bug on Cortex-A9). Without cache invalidation, the CPU will see the
> > old mapping whose physical page can now be used for a different
> > purpose, as illustrated below:
> >
> > Core Physical Memory
> > +-------------------------------+ +------------------+
> > | TLB | | |
> > | VA_A 0xb6e6f -> pfn_q | | pfn_q: code |
> > +-------------------------------+ +------------------+
> > | I-cache |
> > | set[VA_A bits] | tag=pfn_q |
> > +-------------------------------+
> >
> > migrate (kcompactd):
> > 1. copy pfn_q --> pfn_r
> > 2. free pfn_q
> > 3. pte: VA_A -> pfn_r
> > 4. pte_mkold(pte) --> !young
> > 5. ICIALLUIS skipped (because !young)
> >
> > pfn_q reused (OOM pressure):
> > pte: VA_B -> pfn_q (different code)
> >
> > bug:
> > Core Physical Memory
> > +-------------------------------+ +------------------+
> > | TLB (empty) | | pfn_r: old code |
> > +-------------------------------+ | pfn_q: new code |
> > | I-cache | +------------------+
> > | set[VA_A bits] | tag=pfn_q |<--- wrong instructions
> > +-------------------------------+
>
> (nit: Do you have pfn_r and pfn_q mixed up in the "Physical Memory" box?)
No, I don't think so. The intent was to show that whatever was copied
from pfn_q is now in pfn_r while the old page (pfn_q) is now mapped to
VA_B with new code/data. Maybe a classic case of poor naming on my part
here. :-)
>
> > This was verified on ba16-based board (i.MX6Quad/Dual, Cortex-A9) by
> > instrumenting the migration code to track recently migrated pages in a
> > ring buffer and then dumping them in the undefined instruction fault
> > handler. The bug can be triggered with `stress-ng':
> >
> > stress-ng --vm 4 --vm-bytes 2G --vm-method zero-one --verify
> >
> > Note that the system we tested on has only 2G of memory, so the test
> > triggered the OOM-killer in our case.
> >
> > Fixes: 1971188aa196 ("ARM: 7985/1: mm: implement pte_accessible for faulting mappings")
> > Signed-off-by: Brian Ruley <brian.ruley@xxxxxxxxxxxxxxxx>
> > ---
> > arch/arm/include/asm/pgtable.h | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
> > index 6fa9acd6a7f5..e3a5b4a9a65f 100644
> > --- a/arch/arm/include/asm/pgtable.h
> > +++ b/arch/arm/include/asm/pgtable.h
> > @@ -185,7 +185,7 @@ static inline pte_t *pmd_page_vaddr(pmd_t pmd)
> > #define pte_exec(pte) (pte_isclear((pte), L_PTE_XN))
> >
> > #define pte_valid_user(pte) \
> > - (pte_valid(pte) && pte_isset((pte), L_PTE_USER) && pte_young(pte))
> > + (pte_valid(pte) && pte_isset((pte), L_PTE_USER))
>
> This patch is from twelve years ago, so please forgive me for having
> forgotten all of the details. However, my recollection is that when using
> the classic/!lpae format (as you will be on Cortex-A9), page aging is
> implemented by using invalid (translation faulting) ptes for 'old'
> mappings.
>
> So in the case you describe, we may well elide the I-cache maintenance,
> but won't we also put down an invalid pte? If we later take a fault
> on that, we should then perform the cache maintenance when installing
> the young entry (via ptep_set_access_flags()). The more interesting part
> is probably when the mapping for 'VA_B' is installed to map 'pfn_q' but,
> again, I would've expected the cache maintenance to happen just prior to
> installing the valid (young) mapping.
>
> Please can you help me to understand the problem better?
>
> Will
Hi,

I am by no means a domain expert either, so I'll defer to your
judgement. That said, I believe what you said is correct: the
expectation is that we will later take a fault and then perform the
cache maintenance.

However, in the case I describe, if VA_B is mapped to pfn_q immediately
after pfn_q has been unmapped and freed for VA_A, then it's quite
possible that lines from the page are still present in the cache. The
hypothesis is that if VA_A and VA_B land in the same I-cache set and
VA_A's old cache entry still exists (tagged with pfn_q), then the CPU
can fetch stale instructions because the tag will match. That's one
reason why we need to invalidate the cache, but that will be skipped in
the path:

  migrate_pages
    migrate_pages_batch
      migrate_folio_move
        remove_migration_ptes
          remove_migration_pte
            set_pte_at
              set_ptes
                __sync_icache_dcache (skipped if !young)
                set_pte_ext

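To make the "same I-cache set" part of the hypothesis concrete, here is a rough sketch of the index arithmetic. The cache geometry (32 KiB, 4-way, 32-byte lines) is an assumption for illustration and is not stated anywhere in this thread; the VA_B value is made up, and the helper names are hypothetical.

```c
#include <stdint.h>

/* Assumed geometry, for illustration only: 32 KiB, 4-way, 32-byte lines. */
#define ICACHE_SIZE  (32u * 1024u)
#define ICACHE_WAYS  4u
#define ICACHE_LINE  32u
#define ICACHE_SETS  (ICACHE_SIZE / (ICACHE_WAYS * ICACHE_LINE))

/* In a VIPT cache the set index is taken from the virtual address. */
static unsigned int icache_set(uint32_t va)
{
    return (va / ICACHE_LINE) % ICACHE_SETS;
}

/* Two virtual addresses compete for the same set if their index bits match. */
static int same_icache_set(uint32_t va_a, uint32_t va_b)
{
    return icache_set(va_a) == icache_set(va_b);
}
```

Under these assumptions the index covers VA bits [12:5], so bit 12 comes from the virtual page number rather than the page offset. With VA_A = 0xb6e6f000 (the page from the diagram) and a made-up VA_B = 0x76a2d000, same_icache_set() returns true, while 0x76a2c000 lands in a different set.
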
And migrated pages are always marked old:

  mm/migrate.c=static bool remove_migration_pte(struct folio *folio,
  mm/migrate.c:        if (!softleaf_is_migration_young(entry))
  mm/migrate.c:                pte = pte_mkold(pte);

  include/linux/leafops.h:

  static inline bool softleaf_is_migration_young(softleaf_t entry)
  {
          VM_WARN_ON_ONCE(!softleaf_is_migration(entry));

          if (migration_entry_supports_ad())
                  return swp_offset(entry) & SWP_MIG_YOUNG;
          /* Keep the old behavior of aging page after migration */
          return false;
  }

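Putting the two pieces together, the effect of the young bit check can be modelled in user space roughly as follows. This is only a sketch of the software-side decision: the M_PTE_* bits, the model_* helpers and the sync_calls counter are hypothetical stand-ins rather than the kernel's definitions, and it says nothing about the hardware PTE that set_pte_ext ends up writing.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t model_pte_t;

/* Hypothetical flag bits for this model only. */
#define M_PTE_VALID  (1u << 0)
#define M_PTE_YOUNG  (1u << 1)
#define M_PTE_USER   (1u << 8)

/* Stand-in for pte_mkold(): clear the young bit. */
static model_pte_t model_pte_mkold(model_pte_t pte)
{
    return pte & ~M_PTE_YOUNG;
}

/* Check before the patch: an old PTE does not count as a valid user PTE. */
static bool valid_user_before(model_pte_t pte)
{
    return (pte & M_PTE_VALID) && (pte & M_PTE_USER) && (pte & M_PTE_YOUNG);
}

/* Check after the patch: the young bit is no longer consulted. */
static bool valid_user_after(model_pte_t pte)
{
    return (pte & M_PTE_VALID) && (pte & M_PTE_USER);
}

/* Counts how often the stand-in for __sync_icache_dcache() would have run. */
static unsigned int sync_calls;

static void model_set_pte(model_pte_t pte, bool (*valid_user)(model_pte_t))
{
    if (valid_user(pte))
        sync_calls++;   /* stands in for __sync_icache_dcache(pte) */
    /* the PTE itself would be written here */
}
```

For a migrated PTE that has gone through model_pte_mkold(), model_set_pte() with valid_user_before leaves sync_calls untouched, while valid_user_after increments it. That is the behavioural difference the patch is after.
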
I might be misunderstanding something; this took us a while to figure
out. But the patch seems to work for us. I hope this explains it a bit
better.

Best regards,
Brian