Re: [PATCH v6 3/4] arm64: use both ZONE_DMA and ZONE_DMA32

From: Catalin Marinas
Date: Tue Dec 03 2019 - 06:19:07 EST


On Tue, Dec 03, 2019 at 10:12:50AM +0000, Will Deacon wrote:
> On Mon, Dec 02, 2019 at 10:03:17PM -0800, John Stultz wrote:
> > Ok, narrowing it down further, it seems its the following bit from the
> > patch:
> >
> > > @@ -201,13 +212,18 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max)
> > > struct memblock_region *reg;
> > > unsigned long zone_size[MAX_NR_ZONES], zhole_size[MAX_NR_ZONES];
> > > unsigned long max_dma32 = min;
> > > + unsigned long max_dma = min;
> > >
> > > memset(zone_size, 0, sizeof(zone_size));
> > >
> > > - /* 4GB maximum for 32-bit only capable devices */
> > > +#ifdef CONFIG_ZONE_DMA
> > > + max_dma = PFN_DOWN(arm64_dma_phys_limit);
> > > + zone_size[ZONE_DMA] = max_dma - min;
> > > + max_dma32 = max_dma;
> > > +#endif
> > > #ifdef CONFIG_ZONE_DMA32
> > > max_dma32 = PFN_DOWN(arm64_dma32_phys_limit);
> > > - zone_size[ZONE_DMA32] = max_dma32 - min;
> > > + zone_size[ZONE_DMA32] = max_dma32 - max_dma;
> > > #endif
> > > zone_size[ZONE_NORMAL] = max - max_dma32;
> > >
> > > @@ -219,11 +235,17 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max)
> > >
> > > if (start >= max)
> > > continue;
> > > -
> > > +#ifdef CONFIG_ZONE_DMA
> > > + if (start < max_dma) {
> > > + unsigned long dma_end = min_not_zero(end, max_dma);
> > > + zhole_size[ZONE_DMA] -= dma_end - start;
> > > + }
> > > +#endif
> > > #ifdef CONFIG_ZONE_DMA32
> > > if (start < max_dma32) {
> > > - unsigned long dma_end = min(end, max_dma32);
> > > - zhole_size[ZONE_DMA32] -= dma_end - start;
> > > + unsigned long dma32_end = min(end, max_dma32);
> > > + unsigned long dma32_start = max(start, max_dma);
> > > + zhole_size[ZONE_DMA32] -= dma32_end - dma32_start;
> > > }
> > > #endif
> > > if (end > max_dma32) {
> >
> > The zhole_sizes end up being:
> > zhole_size: DMA: 67671, DMA32: 1145664 NORMAL: 0
> >
> > This seems to be due to dma32_start being calculated as 786432 each
> > time - I'm guessing that's the max_dma value - whereas dma32_end is
> > around 548800 and changes each iteration (so we end up subtracting a
> > negative value each pass, growing the size).
[...]
> Anyway, I've had a go at fixing it below but it's 100% untested. I think
> the issue is that your SoC has memblocks contained entirely within the
> ZONE_DMA region and we don't cater for that at all when processing the
> ZONE_DMA32 region.

This seems to be the issue: the SoC memory is contained entirely within
ZONE_DMA. I managed to reproduce it under KVM/Qemu by reducing the
amount of memory given to the guest. You'd also need NUMA disabled to
hit this code path.

Your proposed change fixes it, but I'll send a Tested-by on the full
patch when it hits the list.
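
For the archive, a sketch of the shape of the fix (illustrative only,
not your exact diff; the helper name and PFNs are made up) - only
account a memblock against a zone when it actually intersects that
zone's [zone_start, zone_end) window:

#include <stdio.h>

static unsigned long min(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

static unsigned long max(unsigned long a, unsigned long b)
{
	return a > b ? a : b;
}

/* Subtract a memblock's overlap with [zone_start, zone_end) from the
 * zone's hole size, doing nothing when there is no overlap. */
static void account_zone(unsigned long *zhole, unsigned long start,
			 unsigned long end, unsigned long zone_start,
			 unsigned long zone_end)
{
	if (start < zone_end && end > zone_start)
		*zhole -= min(end, zone_end) - max(start, zone_start);
}

int main(void)
{
	/* Hypothetical PFNs: the block sits entirely inside ZONE_DMA */
	unsigned long start = 524288, end = 548800;
	unsigned long max_dma = 786432, max_dma32 = 1048576;
	unsigned long zhole_dma = max_dma;
	unsigned long zhole_dma32 = max_dma32 - max_dma;

	account_zone(&zhole_dma, start, end, 0, max_dma);
	/* The ZONE_DMA32 hole is left alone: end <= max_dma, so the
	 * block does not intersect [max_dma, max_dma32) */
	account_zone(&zhole_dma32, start, end, max_dma, max_dma32);
	printf("DMA hole: %lu, DMA32 hole: %lu\n", zhole_dma, zhole_dma32);
	return 0;
}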

--
Catalin