Re: [PATCH v2 2/2] mm/page_alloc: Prevent reporting pcp->batch = 0 [mcf5208evb boot failure]

From: Joshua Hahn
Date: Wed Dec 17 2025 - 00:17:06 EST


On Tue, 16 Dec 2025 13:47:03 -0800 Guenter Roeck <linux@xxxxxxxxxxxx> wrote:

> Hi,
>
> On Thu, Oct 09, 2025 at 12:29:31PM -0700, Joshua Hahn wrote:
> > zone_batchsize returns the appropriate value that should be used for
> > pcp->batch. If it finds a zone with less than 4096 pages or PAGE_SIZE >
> > 1M, however, it leads to some incorrect math.
> >
> > In the above case, we will get an intermediary value of 1, which is then
> > rounded down to the nearest power of two, and 1 is subtracted from it.
> > Since 1 is already a power of two, we will get batch = 1-1 = 0:
> >
> > batch = rounddown_pow_of_two(batch + batch/2) - 1;
> >
> > A pcp->batch value of 0 is nonsensical. If this were actually set, then
> > functions like drain_zone_pages would become no-ops, since they could
> > only free 0 pages at a time.
> >
> > Of the two callers of zone_batchsize, the one that is actually used to
> > set pcp->batch works around this by setting pcp->batch to the maximum
> > of 1 and zone_batchsize. However, the other caller, zone_pcp_init,
> > incorrectly prints out the batch size of the zone to be 0.
> >
> > This is probably rare in a typical zone, but the DMA zone can often have
> > less than 4096 pages, which means it will print out "LIFO batch:0".
> >
> > Before: [ 0.001216] DMA zone: 3998 pages, LIFO batch:0
> > After: [ 0.001210] DMA zone: 3998 pages, LIFO batch:1
> >
> > Instead of dealing with the error handling and the mismatch between the
> > reported and actual zone batchsize, just return 1 if the zone_batchsize
> > is 1 page or less before the rounding.
> >
> > Signed-off-by: Joshua Hahn <joshua.hahnjy@xxxxxxxxx>
>
> With this patch in the tree, the qemu 'mcf5208evb' machine fails to boot
> with memory errors such as
>
> S01syslogd: page allocation failure: order:7, mode:0xcc0(GFP_KERNEL), nodemask=(null)
> CPU: 0 UID: 0 PID: 34 Comm: S01syslogd Not tainted 6.19.0-rc1 #1 NONE
> Stack from 407d7ce0:
> 407d7ce0 403df960 403df960 00000000 00000001 00000007 40027c60 403df960
> 400c06be 00000cc0 00000001 407d7d7e 400bf614 407d7d34 403df3ba 407d7d14
> 407d7db8 400c0e5c 00000cc0 00000000 403df3ba 00000007 00000007 00000cc0
> 000d8000 00000018 0000006c 00000001 00000000 40fe6640 00000000 40fe81e4
> 403ffa40 4085eff4 00000000 00000400 00000000 001008c0 00000000 40854041
> f4fe0000 00004041 f4fe0000 00000000 00010000 403ffa40 4085ed00 4085e800
> Call Trace: [<40027c60>] dump_stack+0xc/0x10
> [<400c06be>] warn_alloc+0xdc/0x1bc
> [<400bf614>] get_page_from_freelist+0x0/0xfa6
> [<400c0e5c>] __alloc_frozen_pages_noprof+0x6be/0x8be
> [<400c1358>] get_free_pages_noprof+0x16/0x3e
>
> Reverting this patch fixes the problem.

Hi Guenter,

Thank you for the report. Daniel Palmer has identified an issue on NOMMU
systems, and I think your boot failure has the same cause: mcf5208evb also
appears to be NOMMU (arch/m68k/Kconfig.cpu shows that config M520x depends
on !MMU).
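
For reference, the arithmetic described in the quoted changelog can be
sketched in userspace C as follows. This is only an illustration under
stated assumptions, not the kernel code: rounddown_pow_of_two_sketch() is a
hypothetical stand-in for the kernel helper, and both functions take the
intermediate batch value as an argument instead of deriving it from the
zone size.

```c
/*
 * Hypothetical userspace stand-in for the kernel's rounddown_pow_of_two():
 * returns the largest power of two that is <= n, for n > 0.
 */
unsigned long rounddown_pow_of_two_sketch(unsigned long n)
{
	unsigned long p = 1;

	while (p <= n / 2)
		p *= 2;
	return p;
}

/* Rounding step as quoted in the changelog, with no guard for small values. */
unsigned long batch_before_fix(unsigned long batch)
{
	return rounddown_pow_of_two_sketch(batch + batch / 2) - 1;
}

/* Same step, but returning 1 early when the intermediate value is 1 or less. */
unsigned long batch_after_fix(unsigned long batch)
{
	if (batch <= 1)
		return 1;
	return rounddown_pow_of_two_sketch(batch + batch / 2) - 1;
}
```

With an intermediate value of 1, batch_before_fix(1) gives 0 and
batch_after_fix(1) gives 1, which matches the "LIFO batch:0" and
"LIFO batch:1" lines in the quoted changelog. For larger inputs the two
functions agree.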

Andrew let me know that the patch has already been merged into mainline,
so I'll be sending a fix shortly. Sorry about the problem, and thank you
again for reporting it. I hope you have a great day!
Joshua