Re: [PoC] arm: dma-mapping: direct: Apply dma_pfn_offset only when it is valid

From: Christoph Hellwig
Date: Mon Feb 03 2020 - 12:08:14 EST


On Fri, Jan 31, 2020 at 04:00:20PM +0200, Peter Ujfalusi wrote:
> I see. My PoC patch was not too off then ;)
> So the plan is to have a generic implementation for all of the
> architecture, right?

I don't know of a concrete plan, but that's definitely what I'd like
to see.

> >> The dma_pfn_offset is _still_ applied to the mask we are trying to set
> >> (and validate) via dma-direct.
> >
> > And for the general case that is exactly the right thing to do, we
> > just need to deal with really odd ZONE_DMA placements like yours.
>
> I'm still not convinced, the point of the DMA mask, at least how I see
> it, to check that the dma address can be handled by the device (DMA,
> peripheral with built in DMA, etc), it is not against physical address.
> Doing phys_to_dma() on the mask from the dma_set_mask() is just wrong.

We have a translation between the addresses that the device sees and
those that the CPU sees. The device can address N bits of address space
as seen from the device. The addresses encoded in max_pfn,
zone_dma_bits or the hardcoded 32 in the ZONE_DMA32 case are CPU
addresses. So no, we can't blindly compare those.
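
To make that concrete, here is a rough userspace sketch, not the actual
dma-direct code. The *_sketch helpers and the 4K page size are made up
for illustration, and the k2 numbers in the comments assume DRAM starts
at CPU physical address 0x800000000; only the dma_pfn_offset value comes
from this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4K pages, as on typical 32-bit ARM configurations. */
#define PAGE_SHIFT_SKETCH 12

/* Hypothetical stand-in for phys_to_dma(): CPU physical -> device address. */
static uint64_t phys_to_dma_sketch(uint64_t paddr, uint64_t dma_pfn_offset)
{
	return paddr - (dma_pfn_offset << PAGE_SHIFT_SKETCH);
}

/*
 * The mask is in device address space and the limit is a CPU address,
 * so the limit has to be translated before the two are compared.
 */
static bool mask_covers_sketch(uint64_t mask, uint64_t cpu_limit,
			       uint64_t dma_pfn_offset)
{
	return mask >= phys_to_dma_sketch(cpu_limit, dma_pfn_offset);
}

/* The "blind" comparison: device mask against an untranslated CPU address. */
static bool naive_mask_covers_sketch(uint64_t mask, uint64_t cpu_limit)
{
	return mask >= cpu_limit;
}
```

With dma_pfn_offset = 0x780000 and a CPU-side limit of 0x87fffffff, the
translated limit is 0xffffffff, so a 32-bit mask passes the translated
check but fails the blind one.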


> > But that will cause yet another regression in what we have just fixed
> > with using the generic direct ops, at which points it turns into who
> > screams louder.
>
> Hehe, I see.
> I genuinely curious why k2 platform worked just fine with LPAE (it needs
> it), but guys had issues with LPAE on dra7/am5.
> The fix for dra7/am5 broke k2.
> As far as I can see the main (only) difference is that k2 have
> dma_pfn_offset = 0x780000, while dra7/am5 have it 0 (really direct mapping).

How much memory does the platform have? Once you are above 32 bits' worth
of address space, devices with a 32-bit DMA mask can't address all the
memory. Now if k2, for example, had less than 4G of memory, but at
addresses over 4G, and the dma_pfn_offset compensates for the placement
of the DRAM, it works without bounce buffering and thus doesn't need
swiotlb. But any platform that has DRAM that is not addressable by the
device will need swiotlb.
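
As a back-of-the-envelope check of that reasoning, something like the
following could be used. It is only a sketch: needs_bounce_sketch() is a
made-up helper, 4K pages are assumed, and the memory layouts in the
comments are illustrative guesses rather than numbers from the actual
boards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4K pages. */
#define PAGE_SHIFT_SKETCH 12

/*
 * Hypothetical helper: does any part of DRAM end up above what a device
 * with the given DMA mask can address, once the offset is applied?
 * If so, buffers in that part need bouncing (swiotlb).
 */
static bool needs_bounce_sketch(uint64_t dram_start, uint64_t dram_size,
				uint64_t dma_pfn_offset, uint64_t dma_mask)
{
	uint64_t last_paddr = dram_start + dram_size - 1;
	uint64_t last_dma = last_paddr - (dma_pfn_offset << PAGE_SHIFT_SKETCH);

	return last_dma > dma_mask;
}

/*
 * Illustrative layouts (assumed, not measured):
 *  - 2G of DRAM at 0x800000000 with dma_pfn_offset 0x780000:
 *    last device address is 0xffffffff, a 32-bit mask is enough.
 *  - 4G of DRAM at 0x80000000 with no offset:
 *    last device address is 0x17fffffff, a 32-bit mask needs bouncing.
 */
```

The same check with 8G at 0x800000000 and the k2 offset gives a last
device address of 0x27fffffff, so even with the offset a 32-bit device
would need swiotlb there.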

> > u64 min_mask;
> >
> > + if (mask >= DMA_BIT_MASK(32))
> > + return 1;
> > +
>
> Right, so skipping phys_to_dma() for the mask and believing that it will
> work..
>
> It does: audio and dmatest memcpy tests are just fine with this, MMC
> also probed with ADMA enabled.
>
> As far as I can tell it works as well as falling back to the old arm ops
> in case of LPAE && dma_pfn_offset != 0
>
> Fwiw:
> Tested-by: Peter Ujfalusi <peter.ujfalusi@xxxxxx>
>
> Would you be comfortable to send this patch for mainline with
> Fixes: ad3c7b18c5b3 ("arm: use swiotlb for bounce buffering on LPAE
> configs")

That is the big question. I don't feel overly comfortable as I've been
trying to get this right, but so far it seems like the least bad option.
I'll send out a proper patch with updated comments and will see what
people think.