RE: [EXT] Re: [PATCH] arm64: mm: Fix SOCs with DDR starting above zero

From: Elad Nachman
Date: Thu Jan 04 2024 - 07:39:12 EST




> -----Original Message-----
> From: Florian Fainelli <florian.fainelli@xxxxxxxxxxxx>
> Sent: Wednesday, January 3, 2024 8:32 PM
> To: Will Deacon <will@xxxxxxxxxx>; Elad Nachman
> <enachman@xxxxxxxxxxx>
> Cc: catalin.marinas@xxxxxxx; thunder.leizhen@xxxxxxxxxx;
> bhe@xxxxxxxxxx; akpm@xxxxxxxxxxxxxxxxxxxx; yajun.deng@xxxxxxxxx;
> chris.zjh@xxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; linux-arm-
> kernel@xxxxxxxxxxxxxxxxxxx
> Subject: [EXT] Re: [PATCH] arm64: mm: Fix SOCs with DDR starting above
> zero
>
> External Email
>
> ----------------------------------------------------------------------
> On 1/3/24 09:45, Will Deacon wrote:
> > On Wed, Jan 03, 2024 at 07:00:02PM +0200, Elad Nachman wrote:
> >> From: Elad Nachman <enachman@xxxxxxxxxxx>
> >>
> >> Some SOCs, like the Marvell AC5/X/IM, have a combination
> >> of DDR starting at 0x2_0000_0000 coupled with DMA controllers
> >> limited to 31 and 32 bit of addressing.
> >> This requires to properly arrange ZONE_DMA and ZONE_DMA32 for
> >> these SOCs, so swiotlb and coherent DMA allocation would work
> >> properly.
> >> Change initialization so device tree dma zone bits are taken as
> >> function of offset from DRAM start, and when calculating the
> >> maximal zone physical RAM address for physical DDR starting above
> >> 32-bit, combine the physical address start plus the zone mask
> >> passed as parameter.
> >> This creates the proper zone splitting for these SOCs:
> >> 0..2GB for ZONE_DMA
> >> 2GB..4GB for ZONE_DMA32
> >> 4GB..8GB for ZONE_NORMAL
> >>
> >> Signed-off-by: Elad Nachman <enachman@xxxxxxxxxxx>
> >> ---
> >> arch/arm64/mm/init.c | 20 +++++++++++++++-----
> >> 1 file changed, 15 insertions(+), 5 deletions(-)
> >>
> >> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> >> index 74c1db8ce271..8288c778916e 100644
> >> --- a/arch/arm64/mm/init.c
> >> +++ b/arch/arm64/mm/init.c
> >> @@ -115,20 +115,21 @@ static void __init
> arch_reserve_crashkernel(void)
> >>
> >> /*
> >> * Return the maximum physical address for a zone accessible by the
> given bits
> >> - * limit. If DRAM starts above 32-bit, expand the zone to the maximum
> >> - * available memory, otherwise cap it at 32-bit.
> >> + * limit. If DRAM starts above 32-bit, expand the zone to the available
> memory
> >> + * start limited by the zone bits mask, otherwise cap it at 32-bit.
> >> */
> >> static phys_addr_t __init max_zone_phys(unsigned int zone_bits)
> >> {
> >> phys_addr_t zone_mask = DMA_BIT_MASK(zone_bits);
> >> phys_addr_t phys_start = memblock_start_of_DRAM();
> >> + phys_addr_t phys_end = memblock_end_of_DRAM();
> >>
> >> if (phys_start > U32_MAX)
> >> - zone_mask = PHYS_ADDR_MAX;
> >> + zone_mask = phys_start | zone_mask;
> >> else if (phys_start > zone_mask)
> >> zone_mask = U32_MAX;
> >>
> >> - return min(zone_mask, memblock_end_of_DRAM() - 1) + 1;
> >> + return min(zone_mask, phys_end - 1) + 1;
> >> }
> >>
> >> static void __init zone_sizes_init(void)
> >> @@ -140,7 +141,16 @@ static void __init zone_sizes_init(void)
> >>
> >> #ifdef CONFIG_ZONE_DMA
> >> acpi_zone_dma_bits =
> fls64(acpi_iort_dma_get_max_cpu_address());
> >> - dt_zone_dma_bits = fls64(of_dma_get_max_cpu_address(NULL));
> >> + /*
> >> + * When calculating the dma zone bits from the device tree, subtract
> >> + * the DRAM start address, in case it does not start from address
> >> + * zero. This way. we pass only the zone size related bits to
> >> + * max_zone_phys(), which will add them to the base of the DRAM.
> >> + * This prevents miscalculations on arm64 SOCs which combines
> >> + * DDR starting above 4GB with memory controllers limited to
> >> + * 32-bits or less:
> >> + */
> >> + dt_zone_dma_bits = fls64(of_dma_get_max_cpu_address(NULL) -
> memblock_start_of_DRAM());
> >> zone_dma_bits = min3(32U, dt_zone_dma_bits,
> acpi_zone_dma_bits);
> >> arm64_dma_phys_limit = max_zone_phys(zone_dma_bits);
> >> max_zone_pfns[ZONE_DMA] = PFN_DOWN(arm64_dma_phys_limit);
> >
> > Hmm, I'm a bit worried this could regress other platforms since you seem
> > to be assuming that DMA address 0 corresponds to the physical start of
> > DRAM. What if that isn't the case?
>
> All of our most recent Set-top-box SoCs map DRAM starting at PA
> 0x4000_0000 FWIW. We have the following memory maps:

That is below 4GB

>
> 1. DRAM mapped at PA 0x0000_0000
> 2. DRAM mapped at PA 0x4000_0000 with another memory controller
> starting
> at 0x3_0000_0000
> 3. DRAM mapped at PA 0x4000_0000 with a single memory controller.
>
> Here is the before/after diff with debugging prints introduced to print
> start, end, min, mask and the dt_zone_dma_bits value:
>
> Memory map 1. with 2GB -> no change in output
>
> Memory map 2. with 2+2GB -> no change in output
>
> Memory map 3. with 4GB:
>
> @@ -39,7 +39,7 @@
> [ 0.000000] OF: reserved mem:
> 0x00000000fdfff000..0x00000000fdffffff (4 KiB) nomap non-reusable
> NWMBOX
> [ 0.000000] OF: reserved mem:
> 0x00000000fe000000..0x00000000ffffffff (32768 KiB) nomap non-reusable
> SRR@fe000000
> [ 0.000000] max_zone_phys: start: 0x0000000040000000, end:
> 0x0000000140000000, min: 0x0000000100000000, mask: 0x00000000ffffffff
> -[ 0.000000] zone_sizes_init: dt_zone_dma_bits: 0x00000021
> +[ 0.000000] zone_sizes_init: dt_zone_dma_bits: 0x00000020
> [ 0.000000] max_zone_phys: start: 0x0000000040000000, end:
> 0x0000000140000000, min: 0x0000000100000000, mask: 0x00000000ffffffff
> [ 0.000000] Zone ranges:
> [ 0.000000] DMA [mem 0x0000000040000000-0x00000000ffffffff]
> @@ -88,243 +88,243 @@
>
> Memory map 3. with 2GB:
>
> @@ -39,11 +39,11 @@
> [ 0.000000] OF: reserved mem:
> 0x00000000bdfff000..0x00000000bdffffff (4 KiB) nomap non-reusable
> NWMBOX
> [ 0.000000] OF: reserved mem:
> 0x00000000be000000..0x00000000bfffffff (32768 KiB) nomap non-reusable
> SRR@be000000
> [ 0.000000] max_zone_phys: start: 0x0000000040000000, end:
> 0x00000000c0000000, min: 0x00000000c0000000, mask: 0x00000000ffffffff
> -[ 0.000000] zone_sizes_init: dt_zone_dma_bits: 0x00000020
> -[ 0.000000] max_zone_phys: start: 0x0000000040000000, end:
> 0x00000000c0000000, min: 0x00000000c0000000, mask: 0x00000000ffffffff
> +[ 0.000000] zone_sizes_init: dt_zone_dma_bits: 0x0000001f
> +[ 0.000000] max_zone_phys: start: 0x0000000040000000, end:
> 0x00000000c0000000, min: 0x0000000080000000, mask: 0x000000007fffffff
> [ 0.000000] Zone ranges:
> -[ 0.000000] DMA [mem 0x0000000040000000-0x00000000bfffffff]
> -[ 0.000000] DMA32 empty
> +[ 0.000000] DMA [mem 0x0000000040000000-0x000000007fffffff]
> +[ 0.000000] DMA32 [mem 0x0000000080000000-0x00000000bfffffff]
> [ 0.000000] Normal empty
> [ 0.000000] Movable zone start for each node
> [ 0.000000] Early memory node ranges
> @@ -54,7 +54,7 @@
> --
> Florian

Do you have an example where this works when DDR memory starts above 4GB?
The code in max_zone_phys() has separate if clauses for memory starting above 4GB,
and for memory starting above zone mask but below 4GB...

Elad.