Re: [PATCH] arm64: mm: fix zone_dma_limit calculation

From: Yang Shi
Date: Mon Dec 02 2024 - 11:41:25 EST




On 11/29/24 11:38 AM, Catalin Marinas wrote:
On Fri, Nov 29, 2024 at 06:06:50PM +0000, Robin Murphy wrote:
On 2024-11-27 5:49 pm, Catalin Marinas wrote:
If IORT or DT indicate a large mask covering the whole RAM (i.e. no
restrictions), in an ideal world, we should normally extend ZONE_DMA to
the same.
That's not right, ZONE_DMA should still be relatively limited in size
(unless we really do only have a tiny amount of RAM) - just because a DT
dma-ranges property says the system interconnect can carry >32 address bits
in general doesn't mean that individual device DMA masks can't still be
32-bit or smaller. IIRC we're still implicitly assuming that if DT does
describe an offset range into "high" RAM, it must represent a suitable
lowest common denominator for all relevant devices already, and therefore we
can get away with sizing ZONE_DMA off it blindly.
Fine by me to keep ZONE_DMA in the low range always. I was thinking of
only doing this if ZONE_DMA32 is enabled.

After staring at it for long enough, I think $SUBJECT patch is actually
correct as it is.
Thanks Robin for having a look. Can I add your reviewed-by?

In fact I'm now wondering why the fix was put inside
max_zone_phys() in the first place, since it's clearly pointless to clamp
DMA_BIT_MASK(32) to U32_MAX in the dma32_phys_limit case...
I came to the same conclusion. I think it might have been some left-over
from when we had a ZONE_DMA32 above 4GB (AMD Seattle?). Then we
changed it a few times but only focused on this function for setting the
limits.

However the commit message is perhaps not as clear as it could be -
technically we are correctly *calculating* the appropriate effective
zone_dma_limit value within the scope of zone_sizes_init(), we're just
failing to properly update the actual zone_dma_limit variable for the
benefit of other users.
I'll have a look next week at rewriting the commit message, unless Yang
does it first. I'm planning to queue this patch for -rc2.

Hi Catalin and Robin,

Thanks for moving this forward.

How about the commit message below?

The actual zone_dma_limit variable is not updated properly. When the memsize
limit in IORT or the device tree is greater than U32_MAX, zone_dma_limit is
currently left at that memsize limit instead of being set to U32_MAX.

zone_dma_limit is used to determine whether GFP_DMA should be used when
allocating DMA buffers. With the wrong zone_dma_limit, DMA allocations use
GFP_DMA even though the devices don't require it, and therefore fall into the
DMA zone on node 0. This caused a regression on our two-socket systems due to
excessive remote memory access.
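
For illustration only, here is a rough userspace sketch of the behaviour the
message describes. It is not the kernel code: the sketch_* names and the
simplified zone selection are assumptions made for this example, and the real
logic in zone_sizes_init() and the DMA core has additional conditions that are
left out here.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's U32_MAX, to keep this self-contained. */
#define SKETCH_U32_MAX 0xffffffffULL

/* Simplified zone choice; the names are illustrative, not the kernel's. */
enum sketch_zone { SKETCH_ZONE_DMA, SKETCH_ZONE_DMA32, SKETCH_ZONE_NORMAL };

/*
 * Effective limit as described in the proposed commit message: a firmware
 * (IORT/DT) limit above U32_MAX is clamped to U32_MAX, a smaller one is kept.
 * Assumption: no other conditions are modelled here.
 */
static uint64_t sketch_effective_dma_limit(uint64_t fw_limit)
{
	return fw_limit > SKETCH_U32_MAX ? SKETCH_U32_MAX : fw_limit;
}

/*
 * Pick a zone for a device that can address up to dev_limit, given the
 * zone_dma_limit value visible to the allocator. A stale, too-large
 * zone_dma_limit makes the first test succeed for devices that do not
 * actually need the DMA zone.
 */
static enum sketch_zone sketch_pick_zone(uint64_t dev_limit,
					 uint64_t zone_dma_limit)
{
	if (dev_limit <= zone_dma_limit)
		return SKETCH_ZONE_DMA;
	if (dev_limit <= SKETCH_U32_MAX)
		return SKETCH_ZONE_DMA32;
	return SKETCH_ZONE_NORMAL;
}
```

With a firmware limit of 40 bits left unclamped, a device with a 36-bit mask
ends up in SKETCH_ZONE_DMA; once the limit is clamped to U32_MAX the same
device is no longer steered to the DMA zone.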