Re: Regression with v5.18-rc1 tag on STM32F7 and STM32H7 based boards

From: Hugh Dickins
Date: Wed Apr 06 2022 - 05:49:13 EST


Asking Arnd and others below: should noMMU arches have a good ZERO_PAGE?

On Tue, 5 Apr 2022, Hugh Dickins wrote:
> On Tue, 5 Apr 2022, Patrice CHOTARD wrote:
> >
> > We found an issue with last kernel tag v5.18-rc1 on stm32f746-disco and
> > stm32h743-disco boards (ARMV7-M SoCs).
> >
> > Kernel hangs when executing SetPageUptodate(ZERO_PAGE(0)); in mm/filemap.c.
> >
> > By reverting commit 56a8c8eb1eaf ("tmpfs: do not allocate pages on read"),
> > kernel boots without any issue.
>
> Sorry about that, thanks a lot for finding.
>
> I see that arch/arm/configs/stm32_defconfig says CONFIG_MMU is not set:
> please confirm that is the case here.
>
> Yes, it looks as if NOMMU platforms are liable to have a bogus (that's my
> reading, but it may be unfair) definition for ZERO_PAGE(vaddr), and I was
> walking on ice to touch it without regard for !CONFIG_MMU.
>
> CONFIG_SHMEM depends on CONFIG_MMU, so that PageUptodate is only needed
> when CONFIG_MMU.
>
> Easily fixed by an #ifdef CONFIG_MMU there in mm/filemap.c, but I'll hunt
> around (again) for a better place to do it - though I won't want to touch
> all the architectures for it. I'll post later today.

I could put #ifdef CONFIG_MMU around the SetPageUptodate(ZERO_PAGE(0))
added to pagecache_init(); or if that's considered distasteful, I could
skip making it potentially useful to other filesystems, revert the change
to pagecache_init(), and just do it in mm/shmem.c's CONFIG_SHMEM (hence
CONFIG_MMU) instance of shmem_init().

But I wonder if it's safe for noMMU architectures to go on without a
working ZERO_PAGE(0). It has uses scattered throughout the tree, in
drivers, fs, crypto and more, and it's not at all obvious (to me) that
they all depend on CONFIG_MMU. Some might cause (unreported) crashes,
some might use an unzeroed page in place of a pageful of zeroes.

arm noMMU and h8300 noMMU and m68k noMMU each has
#define ZERO_PAGE(vaddr) (virt_to_page(0))
which seems riskily wrong to me.

h8300 and m68k actually go to the trouble of allocating an empty_zero_page
for this, but then forget to link it up to the ZERO_PAGE(vaddr) definition,
which is what all the common code uses.

arm noMMU does not presently allocate such a page; and I do not feel
entitled to steal a page from arm noMMU platforms, for a hypothetical
case, without agreement.

But here's an unbuilt and untested patch for consideration - which of
course should be split in three if agreed (and perhaps the h8300 part
quietly forgotten if h8300 is already on its way out).

(Yes, arm uses empty_zero_page in a different way from all the other
architectures; but that's okay, and I think arm's way, with virt_to_page()
already baked in, is better than the others; but I've no wish to get into
changing them.)

Patrice, does this patch build and run for you? I have no appreciation
of arm early startup issues, and may have got it horribly wrong.

Thanks,
Hugh

arch/arm/include/asm/pgtable-nommu.h | 3 ++-
arch/arm/mm/nommu.c | 16 ++++++++++++++++
arch/h8300/include/asm/pgtable.h | 6 +++++-
arch/h8300/mm/init.c | 5 +++--
arch/m68k/include/asm/pgtable_no.h | 5 ++++-
5 files changed, 30 insertions(+), 5 deletions(-)

--- a/arch/arm/include/asm/pgtable-nommu.h
+++ b/arch/arm/include/asm/pgtable-nommu.h
@@ -48,7 +48,8 @@ typedef pte_t *pte_addr_t;
* ZERO_PAGE is a global shared page that is always zero: used
* for zero-mapped memory areas etc..
*/
-#define ZERO_PAGE(vaddr) (virt_to_page(0))
+extern struct page *empty_zero_page;
+#define ZERO_PAGE(vaddr) (empty_zero_page)

/*
* Mark the prot value as uncacheable and unbufferable.
--- a/arch/arm/mm/nommu.c
+++ b/arch/arm/mm/nommu.c
@@ -24,6 +24,13 @@

#include "mm.h"

+/*
+ * empty_zero_page is a special page that is used for
+ * zero-initialized data and COW.
+ */
+struct page *empty_zero_page;
+EXPORT_SYMBOL(empty_zero_page);
+
unsigned long vectors_base;

#ifdef CONFIG_ARM_MPU
@@ -148,9 +155,18 @@ void __init adjust_lowmem_bounds(void)
*/
void __init paging_init(const struct machine_desc *mdesc)
{
+ void *zero_page;
+
early_trap_init((void *)vectors_base);
mpu_setup();
bootmem_init();
+
+ zero_page = memblock_alloc(PAGE_SIZE, PAGE_SIZE);
+ if (!zero_page)
+ panic("%s: Failed to allocate %lu bytes align=0x%lx\n",
+ __func__, PAGE_SIZE, PAGE_SIZE);
+ empty_zero_page = virt_to_page(zero_page);
+ flush_dcache_page(empty_zero_page);
}

/*
--- a/arch/h8300/include/asm/pgtable.h
+++ b/arch/h8300/include/asm/pgtable.h
@@ -19,11 +19,15 @@ extern void paging_init(void);

static inline int pte_file(pte_t pte) { return 0; }
#define swapper_pg_dir ((pgd_t *) 0)
+
+/* zero page used for uninitialized stuff */
+extern void *empty_zero_page;
+
/*
* ZERO_PAGE is a global shared page that is always zero: used
* for zero-mapped memory areas etc..
*/
-#define ZERO_PAGE(vaddr) (virt_to_page(0))
+#define ZERO_PAGE(vaddr) (virt_to_page(empty_zero_page))

/*
* These would be in other places but having them here reduces the diffs.
--- a/arch/h8300/mm/init.c
+++ b/arch/h8300/mm/init.c
@@ -41,7 +41,8 @@
* ZERO_PAGE is a special page that is used for zero-initialized
* data and COW.
*/
-unsigned long empty_zero_page;
+void *empty_zero_page;
+EXPORT_SYMBOL(empty_zero_page);

/*
* paging_init() continues the virtual memory environment setup which
@@ -65,7 +66,7 @@ void __init paging_init(void)
* Initialize the bad page table and bad page to point
* to a couple of allocated pages.
*/
- empty_zero_page = (unsigned long)memblock_alloc(PAGE_SIZE, PAGE_SIZE);
+ empty_zero_page = memblock_alloc(PAGE_SIZE, PAGE_SIZE);
if (!empty_zero_page)
panic("%s: Failed to allocate %lu bytes align=0x%lx\n",
__func__, PAGE_SIZE, PAGE_SIZE);
--- a/arch/m68k/include/asm/pgtable_no.h
+++ b/arch/m68k/include/asm/pgtable_no.h
@@ -38,11 +38,14 @@ extern void paging_init(void);
#define __pte_to_swp_entry(pte) ((swp_entry_t) { pte_val(pte) })
#define __swp_entry_to_pte(x) ((pte_t) { (x).val })

+/* zero page used for uninitialized stuff */
+extern void *empty_zero_page;
+
/*
* ZERO_PAGE is a global shared page that is always zero: used
* for zero-mapped memory areas etc..
*/
-#define ZERO_PAGE(vaddr) (virt_to_page(0))
+#define ZERO_PAGE(vaddr) (virt_to_page(empty_zero_page))

/*
* All 32bit addresses are effectively valid for vmalloc...