Re: Regression with v5.18-rc1 tag on STM32F7 and STM32H7 based boards

From: Hugh Dickins
Date: Fri Apr 15 2022 - 22:07:47 EST


On Wed, 6 Apr 2022, Patrice CHOTARD wrote:
> On 4/6/22 08:22, Hugh Dickins wrote:
> > Asking Arnd and others below: should noMMU arches have a good ZERO_PAGE?
> >
> > On Tue, 5 Apr 2022, Hugh Dickins wrote:
> >> On Tue, 5 Apr 2022, Patrice CHOTARD wrote:
> >>>
> >>> We found an issue with last kernel tag v5.18-rc1 on stm32f746-disco and
> >>> stm32h743-disco boards (ARMV7-M SoCs).
> >>>
> >>> Kernel hangs when executing SetPageUptodate(ZERO_PAGE(0)); in mm/filemap.c.
> >>>
> >>> By reverting commit 56a8c8eb1eaf ("tmpfs: do not allocate pages on read"),
> >>> kernel boots without any issue.
> >>
> >> Sorry about that, thanks a lot for finding.
> >>
> >> I see that arch/arm/configs/stm32_defconfig says CONFIG_MMU is not set:
> >> please confirm that is the case here.
> >>
> >> Yes, it looks as if NOMMU platforms are liable to have a bogus (that's my
> >> reading, but it may be unfair) definition for ZERO_PAGE(vaddr), and I was
> >> walking on ice to touch it without regard for !CONFIG_MMU.
> >>
> >> CONFIG_SHMEM depends on CONFIG_MMU, so that PageUptodate is only needed
> >> when CONFIG_MMU.
> >>
> >> Easily fixed by an #ifdef CONFIG_MMU there in mm/filemap.c, but I'll hunt
> >> around (again) for a better place to do it - though I won't want to touch
> >> all the architectures for it. I'll post later today.
> >
> > I could put #ifdef CONFIG_MMU around the SetPageUptodate(ZERO_PAGE(0))
> > added to pagecache_init(); or if that's considered distasteful, I could
> > skip making it potentially useful to other filesystems, revert the change
> > to pagecache_init(), and just do it in mm/shmem.c's CONFIG_SHMEM (hence
> > CONFIG_MMU) instance of shmem_init().
> >
> > But I wonder if it's safe for noMMU architectures to go on without a
> > working ZERO_PAGE(0). It has uses scattered throughout the tree, in
> > drivers, fs, crypto and more, and it's not at all obvious (to me) that
> > they all depend on CONFIG_MMU. Some might cause (unreported) crashes,
> > some might use an unzeroed page in place of a pageful of zeroes.
> >
> > arm noMMU and h8300 noMMU and m68k noMMU each has
> > #define ZERO_PAGE(vaddr) (virt_to_page(0))
> > which seems riskily wrong to me.
> >
> > h8300 and m68k actually go to the trouble of allocating an empty_zero_page
> > for this, but then forget to link it up to the ZERO_PAGE(vaddr) definition,
> > which is what all the common code uses.
> >
> > arm noMMU does not presently allocate such a page; and I do not feel
> > entitled to steal a page from arm noMMU platforms, for a hypothetical
> > case, without agreement.
> >
> > But here's an unbuilt and untested patch for consideration - which of
> > course should be split in three if agreed (and perhaps the h8300 part
> > quietly forgotten if h8300 is already on its way out).
> >
> > (Yes, arm uses empty_zero_page in a different way from all the other
> > architectures; but that's okay, and I think arm's way, with virt_to_page()
> > already baked in, is better than the others; but I've no wish to get into
> > changing them.)
> >
> > Patrice, does this patch build and run for you? I have no appreciation
> > of arm early startup issues, and may have got it horribly wrong.
>
> This patch is okay on my side on both boards (STM32F7 and STM32H7), boot are OK.
>
> Thanks for your reactivity ;-)
> Patrice

Just to wrap up this thread: the tentative arch/ patches below did not
go into 5.18-rc2, but 5.18-rc3 will contain
1bdec44b1eee ("tmpfs: fix regressions from wider use of ZERO_PAGE")
which fixes a further issue, and deletes the line which gave you trouble.

With arch/h8300 removed from linux-next, and arch/arm losing a page by
the patch below, I don't think it's worth my arguing for those changes.
I'd still prefer arch/m68k to expose its empty_zero_page in ZERO_PAGE(),
or else not allocate it; but I won't be pursuing this further.

Thanks for reporting!
Hugh

> >
> > arch/arm/include/asm/pgtable-nommu.h | 3 ++-
> > arch/arm/mm/nommu.c | 16 ++++++++++++++++
> > arch/h8300/include/asm/pgtable.h | 6 +++++-
> > arch/h8300/mm/init.c | 5 +++--
> > arch/m68k/include/asm/pgtable_no.h | 5 ++++-
> > 5 files changed, 30 insertions(+), 5 deletions(-)
> >
> > --- a/arch/arm/include/asm/pgtable-nommu.h
> > +++ b/arch/arm/include/asm/pgtable-nommu.h
> > @@ -48,7 +48,8 @@ typedef pte_t *pte_addr_t;
> > * ZERO_PAGE is a global shared page that is always zero: used
> > * for zero-mapped memory areas etc..
> > */
> > -#define ZERO_PAGE(vaddr) (virt_to_page(0))
> > +extern struct page *empty_zero_page;
> > +#define ZERO_PAGE(vaddr) (empty_zero_page)
> >
> > /*
> > * Mark the prot value as uncacheable and unbufferable.
> > --- a/arch/arm/mm/nommu.c
> > +++ b/arch/arm/mm/nommu.c
> > @@ -24,6 +24,13 @@
> >
> > #include "mm.h"
> >
> > +/*
> > + * empty_zero_page is a special page that is used for
> > + * zero-initialized data and COW.
> > + */
> > +struct page *empty_zero_page;
> > +EXPORT_SYMBOL(empty_zero_page);
> > +
> > unsigned long vectors_base;
> >
> > #ifdef CONFIG_ARM_MPU
> > @@ -148,9 +155,18 @@ void __init adjust_lowmem_bounds(void)
> > */
> > void __init paging_init(const struct machine_desc *mdesc)
> > {
> > + void *zero_page;
> > +
> > early_trap_init((void *)vectors_base);
> > mpu_setup();
> > bootmem_init();
> > +
> > + zero_page = memblock_alloc(PAGE_SIZE, PAGE_SIZE);
> > + if (!zero_page)
> > + panic("%s: Failed to allocate %lu bytes align=0x%lx\n",
> > + __func__, PAGE_SIZE, PAGE_SIZE);
> > + empty_zero_page = virt_to_page(zero_page);
> > + flush_dcache_page(empty_zero_page);
> > }
> >
> > /*
> > --- a/arch/h8300/include/asm/pgtable.h
> > +++ b/arch/h8300/include/asm/pgtable.h
> > @@ -19,11 +19,15 @@ extern void paging_init(void);
> >
> > static inline int pte_file(pte_t pte) { return 0; }
> > #define swapper_pg_dir ((pgd_t *) 0)
> > +
> > +/* zero page used for uninitialized stuff */
> > +extern void *empty_zero_page;
> > +
> > /*
> > * ZERO_PAGE is a global shared page that is always zero: used
> > * for zero-mapped memory areas etc..
> > */
> > -#define ZERO_PAGE(vaddr) (virt_to_page(0))
> > +#define ZERO_PAGE(vaddr) (virt_to_page(empty_zero_page))
> >
> > /*
> > * These would be in other places but having them here reduces the diffs.
> > --- a/arch/h8300/mm/init.c
> > +++ b/arch/h8300/mm/init.c
> > @@ -41,7 +41,8 @@
> > * ZERO_PAGE is a special page that is used for zero-initialized
> > * data and COW.
> > */
> > -unsigned long empty_zero_page;
> > +void *empty_zero_page;
> > +EXPORT_SYMBOL(empty_zero_page);
> >
> > /*
> > * paging_init() continues the virtual memory environment setup which
> > @@ -65,7 +66,7 @@ void __init paging_init(void)
> > * Initialize the bad page table and bad page to point
> > * to a couple of allocated pages.
> > */
> > - empty_zero_page = (unsigned long)memblock_alloc(PAGE_SIZE, PAGE_SIZE);
> > + empty_zero_page = memblock_alloc(PAGE_SIZE, PAGE_SIZE);
> > if (!empty_zero_page)
> > panic("%s: Failed to allocate %lu bytes align=0x%lx\n",
> > __func__, PAGE_SIZE, PAGE_SIZE);
> > --- a/arch/m68k/include/asm/pgtable_no.h
> > +++ b/arch/m68k/include/asm/pgtable_no.h
> > @@ -38,11 +38,14 @@ extern void paging_init(void);
> > #define __pte_to_swp_entry(pte) ((swp_entry_t) { pte_val(pte) })
> > #define __swp_entry_to_pte(x) ((pte_t) { (x).val })
> >
> > +/* zero page used for uninitialized stuff */
> > +extern void *empty_zero_page;
> > +
> > /*
> > * ZERO_PAGE is a global shared page that is always zero: used
> > * for zero-mapped memory areas etc..
> > */
> > -#define ZERO_PAGE(vaddr) (virt_to_page(0))
> > +#define ZERO_PAGE(vaddr) (virt_to_page(empty_zero_page))
> >
> > /*
> > * All 32bit addresses are effectively valid for vmalloc...
>