Re: [RFC 3/4] dma-direct: add dma_direct_min_mask

From: Catalin Marinas
Date: Wed Jul 24 2019 - 09:51:30 EST

On Fri, Jul 19, 2019 at 03:08:52PM +0200, Nicolas Saenz Julienne wrote:
> On Thu, 2019-07-18 at 13:18 +0200, Nicolas Saenz Julienne wrote:
> > On Thu, 2019-07-18 at 11:15 +0200, Christoph Hellwig wrote:
> > > On Wed, Jul 17, 2019 at 05:31:34PM +0200, Nicolas Saenz Julienne wrote:
> > > > Historically devices with ZONE_DMA32 have been assumed to be able to
> > > > address at least the lower 4GB of ram for DMA. This is still the defualt
> > > > behavior yet the Raspberry Pi 4 is limited to the first GB of memory.
> > > > This has been observed to trigger failures in dma_direct_supported() as
> > > > the 'min_mask' isn't properly set.
> > > >
> > > > We create 'dma_direct_min_mask' in order for the arch init code to be
> > > > able to fine-tune dma direct's 'min_dma' mask.
> > >
> > > Normally we use ZONE_DMA for that case.
> >
> > Fair enough, I didn't think of that possibility.
> >
> > So would the arm64 maintainers be happy with something like this:
> >
> > - ZONE_DMA: Follows standard definition, 16MB in size. ARCH_ZONE_DMA_BITS is
> > left as is.
> > - ZONE_DMA32: Will honor the most constraining 'dma-ranges'. Which so far for
> > most devices is 4G, except for RPi4.
> > - ZONE_NORMAL: The rest of the memory.
> Never mind this suggestion, I don't think it makes any sense. If anything arm64
> seems to fit the ZONE_DMA usage pattern of arm and powerpc: where ZONE_DMA's
> size is decided based on ram size and/or board configuration. It was actually
> set-up like this until Christoph's ad67f5a6545f7 ("arm64: replace ZONE_DMA with
> ZONE_DMA32").
> So the easy solution would be to simply revert that commit. On one hand I feel
> it would be a step backwards as most 64 bit architectures have been moving to
> use ZONE_DMA32. On the other, current ZONE_DMA32 usage seems to be heavily
> rooted on having a 32 bit DMA mask*, which will no longer be the case on arm64
> if we want to support the RPi 4.
> So the way I see it and lacking a better solution, the argument is stronger on
> moving back arm64 to using ZONE_DMA. Any comments/opinions?

As it was suggested in this or the previous thread, I'm not keen on
limiting ZONE_DMA32 to the smalles RPi4 can cover, as the naming
implies this zone should cover 32-bit devices that can deal with a full
32-bit mask.

I think it may be better if we have both ZONE_DMA and ZONE_DMA32 on
arm64. ZONE_DMA would be based on the smallest dma-ranges as described
in the DT while DMA32 covers the first naturally aligned 4GB of RAM
(unchanged). When a smaller ZONE_DMA is not needed, it could be expanded
to cover what would normally be ZONE_DMA32 (or could we have ZONE_DMA as
0-bytes? I don't think GFP_DMA can still allocate memory in this case).

We'd probably have to define ARCH_ZONE_DMA_BITS for arm64 to something
smaller than 32-bit but sufficient to cover the known platforms like
RPi4 (the current 24 is too small, so maybe 30). AFAICT,
__dma_direct_optimal_gfp_mask() figures out whether GFP_DMA or GFP_DMA32
should be passed.