Re: [v3 PATCH] arm64: mm: Fix kexec failure after pte_mkwrite_novma() change
From: Will Deacon
Date: Fri Feb 13 2026 - 06:56:37 EST

On Thu, Feb 12, 2026 at 10:51:45AM -0800, Guenter Roeck wrote:
> On Thu, Dec 04, 2025 at 02:27:22PM +0800, Jianpeng Chang wrote:
> > Commit 143937ca51cc ("arm64, mm: avoid always making PTE dirty in
> > pte_mkwrite()") modified pte_mkwrite_novma() to only clear PTE_RDONLY
> > when the page is already dirty (PTE_DIRTY is set). While this optimization
> > prevents unnecessary dirty page marking in normal memory management paths,
> > it breaks kexec on some platforms like NXP LS1043.
> >
> > The issue occurs in the kexec code path:
> > 1. machine_kexec_post_load() calls trans_pgd_create_copy() to create a
> >    writable copy of the linear mapping
> > 2. _copy_pte() calls pte_mkwrite_novma() to ensure all pages in the copy
> >    are writable for the new kernel image copying
> > 3. With the new logic, clean pages (without PTE_DIRTY) remain read-only
> > 4. When kexec tries to copy the new kernel image through the linear
> >    mapping, it fails on read-only pages, causing the system to hang
> >    after "Bye!"
> >
> > The same issue affects hibernation which uses the same trans_pgd code path.
> >
> > Fix this by marking pages dirty with pte_mkdirty() in _copy_pte(), which
> > ensures pte_mkwrite_novma() clears PTE_RDONLY for both kexec and
> > hibernation, making all pages in the temporary mapping writable regardless
> > of their dirty state. This preserves the original commit's optimization
> > for normal memory management while fixing the kexec/hibernation regression.
> >
> > Using pte_mkdirty() causes redundant bit operations when the page is
> > already writable (redundant PTE_RDONLY clearing), but this is acceptable
> > since it's not a hot path and only affects kexec/hibernation scenarios.
> >
> > Fixes: 143937ca51cc ("arm64, mm: avoid always making PTE dirty in pte_mkwrite()")
> > Signed-off-by: Jianpeng Chang <jianpeng.chang.cn@xxxxxxxxxxxxx>
> > Reviewed-by: Huang Ying <ying.huang@xxxxxxxxxxxxxxxxx>
>
> We (Google) are seeing this problem on servers using the Ampere Siryn
> CPU. The change has now been backported all the way down to v6.6.y (and
> maybe further), essentially making kexec unusable on affected systems
> unless the backport of commit 143937ca51cc is dropped.
>
> What is the status of this patch?

Catalin and I would prefer to treat kernel mappings as dirty, as
suggested in:
https://lore.kernel.org/r/aVgUPNzXHHIBhh5A@xxxxxxx
If somebody sends a (tested) patch, we'll take it.

Will
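
For readers following the thread, the failure described in the quoted
commit message can be illustrated with a small user-space model. This is
only a sketch, not kernel code: the model_* helpers are hypothetical, the
bit positions are illustrative, and it assumes that a store succeeds only
when PTE_RDONLY is clear. The ordering in model_copy_pte_fixed() follows
the commit message's description (pte_mkdirty() first, so that
pte_mkwrite_novma() then clears PTE_RDONLY); the patch itself is not
shown in this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t pteval_t;

/* Illustrative bit positions for this model only. */
#define PTE_RDONLY (UINT64_C(1) << 7)   /* entry is read-only */
#define PTE_WRITE  (UINT64_C(1) << 51)  /* entry is allowed to become writable */
#define PTE_DIRTY  (UINT64_C(1) << 55)  /* software dirty bit */

/* Model of the behaviour described after commit 143937ca51cc:
 * PTE_RDONLY is only cleared if the entry is already dirty. */
static inline pteval_t model_pte_mkwrite_novma(pteval_t pte)
{
	pte |= PTE_WRITE;
	if (pte & PTE_DIRTY)
		pte &= ~PTE_RDONLY;
	return pte;
}

/* Model of marking an entry dirty; a writable entry loses PTE_RDONLY. */
static inline pteval_t model_pte_mkdirty(pteval_t pte)
{
	pte |= PTE_DIRTY;
	if (pte & PTE_WRITE)
		pte &= ~PTE_RDONLY;
	return pte;
}

/* Assumption: a store through the mapping works only if PTE_RDONLY is clear. */
static inline bool model_store_allowed(pteval_t pte)
{
	return !(pte & PTE_RDONLY);
}

/* What the copy path did before the proposed fix (hypothetical helper). */
static inline pteval_t model_copy_pte_old(pteval_t pte)
{
	return model_pte_mkwrite_novma(pte);
}

/* What the commit message proposes: mark dirty, then make writable. */
static inline pteval_t model_copy_pte_fixed(pteval_t pte)
{
	return model_pte_mkwrite_novma(model_pte_mkdirty(pte));
}

/* Walks through the clean-page case from the commit message. */
static inline void model_self_check(void)
{
	pteval_t clean = PTE_RDONLY; /* clean, read-only linear-map entry */

	/* Step 3: the clean page stays read-only in the copy. */
	assert(!model_store_allowed(model_copy_pte_old(clean)));
	/* With the fix, the same page becomes writable. */
	assert(model_store_allowed(model_copy_pte_fixed(clean)));
}
```

Calling model_self_check() from a test program exercises both paths. In
this model, a page that is already dirty is writable either way, which
matches the observation that only clean pages trigger the hang.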