Re: [PATCH v2] mm: use clear_user_(high)page() for arch with special user folio handling

From: Mathieu Desnoyers
Date: Sat Dec 07 2024 - 10:31:56 EST


On 2024-12-06 12:42, Zi Yan wrote:
For architectures where cpu_dcache_is_aliasing() returns true, which
require cache flushing, and for arc, which changes folio->flags after
clearing a user folio, __GFP_ZERO (which uses only clear_page()) is not
enough to zero user folios; clear_user_(high)page() must be used
instead. Otherwise, user data will be corrupted.
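
For reference, arc's clear_user_page() (arch/arc/mm/cache.c) does
roughly the following; note the folio->flags update that a bare
clear_page() would skip:

void clear_user_page(void *to, unsigned long u_vaddr, struct page *page)
{
        struct folio *folio = page_folio(page);

        clear_page(to);
        /* the folio->flags change: the dcache is no longer marked clean */
        clear_bit(PG_dc_clean, &folio->flags);
}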

Fix it by always clearing user folios with clear_user_(high)page() when
cpu_dcache_is_aliasing() is true or the architecture is arc. Rename
alloc_zeroed() to alloc_need_zeroing() and invert the logic to clarify
its intent.

Fixes: 5708d96da20b ("mm: avoid zeroing user movable page twice with init_on_alloc=1")
Reported-by: Geert Uytterhoeven <geert+renesas@xxxxxxxxx>
Closes: https://lore.kernel.org/linux-mm/CAMuHMdV1hRp_NtR5YnJo=HsfgKQeH91J537Gh4gKk3PFZhSkbA@xxxxxxxxxxxxxx/
Tested-by: Geert Uytterhoeven <geert+renesas@xxxxxxxxx>
Signed-off-by: Zi Yan <ziy@xxxxxxxxxx>
---
include/linux/highmem.h |  8 +++++++-
include/linux/mm.h      | 17 +++++++++++++++++
mm/huge_memory.c        |  9 +++++----
mm/internal.h           |  6 ------
mm/memory.c             | 10 +++++-----
5 files changed, 34 insertions(+), 16 deletions(-)

diff --git a/include/linux/highmem.h b/include/linux/highmem.h
index 6e452bd8e7e3..d9beb8371daa 100644
--- a/include/linux/highmem.h
+++ b/include/linux/highmem.h
@@ -224,7 +224,13 @@ static inline
struct folio *vma_alloc_zeroed_movable_folio(struct vm_area_struct *vma,
unsigned long vaddr)
{
- return vma_alloc_folio(GFP_HIGHUSER_MOVABLE | __GFP_ZERO, 0, vma, vaddr);
+ struct folio *folio;
+
+ folio = vma_alloc_folio(GFP_HIGHUSER_MOVABLE, 0, vma, vaddr);
+ if (folio && alloc_need_zeroing())
+ clear_user_highpage(&folio->page, vaddr);
+
+ return folio;
}
#endif
diff --git a/include/linux/mm.h b/include/linux/mm.h
index c39c4945946c..ca8df5871213 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -31,6 +31,7 @@
#include <linux/kasan.h>
#include <linux/memremap.h>
#include <linux/slab.h>
+#include <linux/cacheinfo.h>
struct mempolicy;
struct anon_vma;
@@ -4175,6 +4176,22 @@ static inline int do_mseal(unsigned long start, size_t len_in, unsigned long fla
}
#endif
+/*
+ * alloc_need_zeroing checks if a user folio from the page allocator
+ * needs to be zeroed.
+ */
+static inline bool alloc_need_zeroing(void)
+{
+ /*
+ * for user folios, arches with cache aliasing require a cache flush
+ * and arc changes folio->flags, so always return true to make the
+ * caller use clear_user_page()/clear_user_highpage()
+ */
+ return (cpu_dcache_is_aliasing() || IS_ENABLED(CONFIG_ARC)) ||

Nack.

Can we please not re-introduce arch-specific conditionals in generic
mm code after the cleanup I did when introducing
cpu_dcache_is_aliasing() in commit 8690bbcf3b70?

Based on commit eacd0e950dc2, AFAIU what you appear to need here
is to introduce a:

cpu_icache_is_aliasing() -> note the "i" for instruction cache

It would typically be defined directly as

#define cpu_icache_is_aliasing() cpu_dcache_is_aliasing()

except on architectures like ARC, where the icache aliases with the
dcache, but the dcache does not alias with itself.

So for ARC it would be defined as:

#define cpu_dcache_is_aliasing() false
#define cpu_icache_is_aliasing() true

And the Kconfig ARCH_HAS_CPU_CACHE_ALIASING=y would be set for ARC
again.
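
Something along these lines would do it, as an untested sketch (the
generic fallback mirrors the existing dcache helper in
include/linux/cacheinfo.h; the ARC header is my guess at where the
override would live):

/* include/linux/cacheinfo.h */
#ifndef CONFIG_ARCH_HAS_CPU_CACHE_ALIASING
#define cpu_dcache_is_aliasing()        false
#define cpu_icache_is_aliasing()        false
#else
#include <asm/cachetype.h>
#ifndef cpu_icache_is_aliasing
#define cpu_icache_is_aliasing()        cpu_dcache_is_aliasing()
#endif
#endif

/* arch/arc/include/asm/cachetype.h (new) */
#define cpu_dcache_is_aliasing()        false
#define cpu_icache_is_aliasing()        true

The check in alloc_need_zeroing() could then become

        cpu_dcache_is_aliasing() || cpu_icache_is_aliasing() || ...

instead of testing IS_ENABLED(CONFIG_ARC) in generic code.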

I'm not entirely sure if we want to go for the wording "is_aliasing"
or "is_incoherent" when talking about icache vs dcache, so I'm open
to ideas here.

Thanks,

Mathieu

+ !static_branch_maybe(CONFIG_INIT_ON_ALLOC_DEFAULT_ON,
+ &init_on_alloc);
+}
+
int arch_get_shadow_stack_status(struct task_struct *t, unsigned long __user *status);
int arch_set_shadow_stack_status(struct task_struct *t, unsigned long status);
int arch_lock_shadow_stack_status(struct task_struct *t, unsigned long status);
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index ee335d96fc39..107130a5413a 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1176,11 +1176,12 @@ static struct folio *vma_alloc_anon_folio_pmd(struct vm_area_struct *vma,
folio_throttle_swaprate(folio, gfp);
/*
- * When a folio is not zeroed during allocation (__GFP_ZERO not used),
- * folio_zero_user() is used to make sure that the page corresponding
- * to the faulting address will be hot in the cache after zeroing.
+ * When a folio is not zeroed during allocation (__GFP_ZERO not used)
+ * or when user folios require special handling, folio_zero_user() is
+ * used to make sure that the page corresponding to the faulting
+ * address will be hot in the cache after zeroing.
*/
- if (!alloc_zeroed())
+ if (alloc_need_zeroing())
folio_zero_user(folio, addr);
/*
* The memory barrier inside __folio_mark_uptodate makes sure that
diff --git a/mm/internal.h b/mm/internal.h
index cb8d8e8e3ffa..3bd08bafad04 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -1285,12 +1285,6 @@ void touch_pud(struct vm_area_struct *vma, unsigned long addr,
void touch_pmd(struct vm_area_struct *vma, unsigned long addr,
pmd_t *pmd, bool write);
-static inline bool alloc_zeroed(void)
-{
- return static_branch_maybe(CONFIG_INIT_ON_ALLOC_DEFAULT_ON,
- &init_on_alloc);
-}
-
/*
* Parses a string with mem suffixes into its order. Useful to parse kernel
* parameters.
diff --git a/mm/memory.c b/mm/memory.c
index 75c2dfd04f72..cf1611791856 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4733,12 +4733,12 @@ static struct folio *alloc_anon_folio(struct vm_fault *vmf)
folio_throttle_swaprate(folio, gfp);
/*
* When a folio is not zeroed during allocation
- * (__GFP_ZERO not used), folio_zero_user() is used
- * to make sure that the page corresponding to the
- * faulting address will be hot in the cache after
- * zeroing.
+ * (__GFP_ZERO not used) or when user folios require
+ * special handling, folio_zero_user() is used to make
+ * sure that the page corresponding to the faulting
+ * address will be hot in the cache after zeroing.
*/
- if (!alloc_zeroed())
+ if (alloc_need_zeroing())
folio_zero_user(folio, vmf->address);
return folio;
}

--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com