Re: [PATCH] arm64: mm: Fix kexec failure after pte_mkwrite_novma() change

From: Jianpeng Chang
Date: Mon Dec 01 2025 - 02:55:29 EST



On 11/28/25 5:32 PM, Huang, Ying wrote:

Hi, Jianpeng,

Jianpeng Chang <jianpeng.chang.cn@xxxxxxxxxxxxx> writes:

Commit 143937ca51cc ("arm64, mm: avoid always making PTE dirty in
pte_mkwrite()") modified pte_mkwrite_novma() to only clear PTE_RDONLY
when the page is already dirty (PTE_DIRTY is set). While this optimization
prevents unnecessary dirty page marking in normal memory management paths,
it breaks kexec on some platforms like NXP LS1043.

The issue occurs in the kexec code path:
1. machine_kexec_post_load() calls trans_pgd_create_copy() to create a
writable copy of the linear mapping
2. _copy_pte() calls pte_mkwrite_novma() to ensure all pages in the copy
are writable for the new kernel image copying
3. With the new logic, clean pages (without PTE_DIRTY) remain read-only
4. When kexec tries to copy the new kernel image through the linear
mapping, it fails on read-only pages, causing the system to hang
after "Bye!"

The same issue affects hibernation which uses the same trans_pgd code path.
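
The regression described in the steps above can be modelled in a few lines of plain C. This is only a sketch, not kernel code: the model_* names are hypothetical, the bit positions are assumed to mirror the arm64 definitions and should be treated as illustrative, and the "writable" check assumes a CPU without hardware dirty state management, where PTE_RDONLY alone decides whether a store faults.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-ins for the arm64 PTE bits discussed in the patch. */
typedef uint64_t pteval_t;

#define PTE_RDONLY ((pteval_t)1 << 7)   /* hardware read-only permission */
#define PTE_WRITE  ((pteval_t)1 << 51)  /* software "writable" (DBM) bit */
#define PTE_DIRTY  ((pteval_t)1 << 55)  /* software dirty bit */

/* Model of the old behaviour: PTE_RDONLY is cleared unconditionally. */
static pteval_t model_mkwrite_old(pteval_t pte)
{
	pte |= PTE_WRITE;
	pte &= ~PTE_RDONLY;
	return pte;
}

/* Model of the new behaviour: PTE_RDONLY is cleared only for dirty pages. */
static pteval_t model_mkwrite_new(pteval_t pte)
{
	pte |= PTE_WRITE;
	if (pte & PTE_DIRTY)
		pte &= ~PTE_RDONLY;
	return pte;
}

/*
 * Assumption: no hardware dirty state management, so a store succeeds
 * only when PTE_RDONLY is clear.
 */
static bool model_hw_writable(pteval_t pte)
{
	return !(pte & PTE_RDONLY);
}
```

Feeding a clean read-only entry (only PTE_RDONLY set) through model_mkwrite_new() leaves PTE_RDONLY in place, which corresponds to step 3 above; the same entry passed through model_mkwrite_old() comes out writable.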

Fix this by explicitly clearing PTE_RDONLY in _copy_pte() for both
kexec and hibernation, ensuring all pages in the temporary mapping are
writable regardless of their dirty state. This preserves the original
commit's optimization for normal memory management while fixing the
kexec/hibernation regression.

Fixes: 143937ca51cc ("arm64, mm: avoid always making PTE dirty in pte_mkwrite()")
IMHO, this isn't the right "Fixes" tag. The original _copy_pte() code
should be the fixing target.

Hi Ying,

As I understand it, the Fixes tag should point to the commit that directly
introduced the issue. While _copy_pte() was introduced with
pte_mkwrite_novma() in commit 6ecc21bb432d, at that time pte_mkwrite_novma()
cleared the PTE_RDONLY bit unconditionally, and kexec worked correctly.
Should we blame a change that was working properly at the time, or am I
missing something here?


Thanks,

Jianpeng


Signed-off-by: Jianpeng Chang <jianpeng.chang.cn@xxxxxxxxxxxxx>
---
arch/arm64/mm/trans_pgd.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/mm/trans_pgd.c b/arch/arm64/mm/trans_pgd.c
index 18543b603c77..ad4e5e4fcc91 100644
--- a/arch/arm64/mm/trans_pgd.c
+++ b/arch/arm64/mm/trans_pgd.c
@@ -40,8 +40,13 @@ static void _copy_pte(pte_t *dst_ptep, pte_t *src_ptep, unsigned long addr)
* Resume will overwrite areas that may be marked
* read only (code, rodata). Clear the RDONLY bit from
* the temporary mappings we use during restore.
+ *
+ * For kexec/hibernation, we need writable access regardless
+ * of the page's dirty state, so force clear PTE_RDONLY.
*/
- __set_pte(dst_ptep, pte_mkwrite_novma(pte));
+ pte = set_pte_bit(pte, __pgprot(PTE_WRITE));
+ pte = clear_pte_bit(pte, __pgprot(PTE_RDONLY));
+ __set_pte(dst_ptep, pte);
Why not

__set_pte(dst_ptep, pte_mkwrite_novma(pte_mkdirty(pte)));

?

I agree that using pte_mkdirty() is better: it makes the modification
clearer and avoids the helper functions. I will change it.
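
For comparison, the two variants discussed here can be put side by side in a small user-space model. This is a sketch under stated assumptions, not the kernel implementation: the model_* names are hypothetical, the bit positions are illustrative, and model_mkdirty() assumes that pte_mkdirty() sets the software dirty bit and clears PTE_RDONLY only when the entry is already writable.

```c
#include <stdint.h>

typedef uint64_t pteval_t;

#define PTE_RDONLY ((pteval_t)1 << 7)   /* hardware read-only permission */
#define PTE_WRITE  ((pteval_t)1 << 51)  /* software "writable" (DBM) bit */
#define PTE_DIRTY  ((pteval_t)1 << 55)  /* software dirty bit */

/* Assumed model of pte_mkdirty(): mark dirty, drop RDONLY if writable. */
static pteval_t model_mkdirty(pteval_t pte)
{
	pte |= PTE_DIRTY;
	if (pte & PTE_WRITE)
		pte &= ~PTE_RDONLY;
	return pte;
}

/* Model of pte_mkwrite_novma() after the change: RDONLY cleared if dirty. */
static pteval_t model_mkwrite_novma(pteval_t pte)
{
	pte |= PTE_WRITE;
	if (pte & PTE_DIRTY)
		pte &= ~PTE_RDONLY;
	return pte;
}

/* Variant in the posted patch: set PTE_WRITE, clear PTE_RDONLY directly. */
static pteval_t model_fix_explicit(pteval_t pte)
{
	pte |= PTE_WRITE;
	pte &= ~PTE_RDONLY;
	return pte;
}

/* Suggested variant: mark the entry dirty first, then make it writable. */
static pteval_t model_fix_mkdirty(pteval_t pte)
{
	return model_mkwrite_novma(model_mkdirty(pte));
}
```

In this model both variants clear PTE_RDONLY for a clean read-only entry. The only difference is that the pte_mkdirty() variant also leaves PTE_DIRTY set in the temporary mapping.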

} else if (!pte_none(pte)) {
/*
* debug_pagealloc will removed the PTE_VALID bit if
@@ -57,7 +62,10 @@ static void _copy_pte(pte_t *dst_ptep, pte_t *src_ptep, unsigned long addr)
*/
BUG_ON(!pfn_valid(pte_pfn(pte)));

- __set_pte(dst_ptep, pte_mkvalid(pte_mkwrite_novma(pte)));
+ pte = pte_mkvalid(pte);
+ pte = set_pte_bit(pte, __pgprot(PTE_WRITE));
+ pte = clear_pte_bit(pte, __pgprot(PTE_RDONLY));
+ __set_pte(dst_ptep, pte);
}
}
---
Best Regards,
Huang, Ying