Re: [PATCH 07/10] crypto: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN

From: Ard Biesheuvel
Date: Tue Apr 12 2022 - 19:17:33 EST


On Tue, 12 Apr 2022 at 14:31, Catalin Marinas <catalin.marinas@xxxxxxx> wrote:
>
> On Tue, Apr 12, 2022 at 06:18:46PM +0800, Herbert Xu wrote:
> > On Tue, Apr 12, 2022 at 11:02:54AM +0100, Catalin Marinas wrote:
> > > This series does not penalise any architecture. It doesn't even make
> > > arm64 any worse than it currently is.
> >
> > Right, the patch as it stands doesn't change anything. However,
> > it is also broken as it stands. As I said before, CRYPTO_MINALIGN
> > is not something that is guaranteed by the Crypto API, it is simply
> > a statement of whatever kmalloc returns.
>
> I agree that CRYPTO_MINALIGN is not guaranteed by the Crypto API. What
> I'm debating is the intended use for CRYPTO_MINALIGN in some (most?) of
> the drivers. It's not just about kmalloc() but also a build-time offset
> of buffers within structures to guarantee DMA safety. This can't be
> fixed by cra_alignmask.
>
> We could leave CRYPTO_MINALIGN as ARCH_KMALLOC_MINALIGN and that matches
> it being just a statement of the kmalloc() minimum alignment. But since
> it is also overloaded with the DMA in-structure offset alignment, we'd
> need a new CRYPTO_DMA_MINALIGN (and _ATTR) to annotate those structures.
> I have a suspicion there'll be fewer of the original CRYPTO_MINALIGN
> uses left, hence my approach to making this bigger from the start.
>
> There's also Ard's series introducing CRYPTO_REQ_MINALIGN while leaving
> CRYPT_MINALIGN for DMA-safe offsets (IIUC):
>
> https://lore.kernel.org/r/20220406142715.2270256-1-ardb@xxxxxxxxxx
>
> > So if kmalloc is no longer returning CRYPTO_MINALIGN-aligned
> > memory, then those drivers that need this alignment for DMA
> > will break anyway.
>

One thing to note here is that minimum DMA *alignment* is not the same
as the padding to cache writeback granule (CWG) that is needed when
dealing with non-cache coherent inbound DMA.

The former is typically a property of the peripheral IP, and is
something that the driver needs to deal with, potentially by setting
cra_alignmask to ensure that the input and output buffers are placed
such that they can accessed via DMA by the peripheral.

The latter is a property of the CPU's cache hierarchy, not only the
size of the CWG, but also whether or not DMA is cache coherent to
begin with. This is not something the driver should usually have to
care about if it uses the DMA API correctly.

The reason why CRYPTO_MINALIGN matters for DMA in spite of this is
that some drivers not only use DMA for processing the bulk of the data
(typically presented using scatterlists) but sometimes also use DMA to
map parts of the request and TFM structures, which carry control data
used by the CPU to manage the crypto requests. Doing a non-coherent
DMA write into such a structure may blow away 64 or 128 bytes of data,
even if the write itself is much smaller, due to the fact that we need
to perform cache invalidation in order for the CPU to be able to
observe what the device wrote to that memory, and the invalidated
cache lines may be shared with other data items, and may become dirty
while the DMA mapping is active.

This is what I am addressing with my patch series, i.e., padding out
the driver owned parts of the struct to the CWG size so that cache
invalidation does not affect data owned by other layers in the crypto
cake, but only at runtime. By doing this consistently for TFM and
request structures, we should be able to disregard ARCH_DMA_MINALIGN
entirely when it comes to defining CRYPTO_MINALIGN, as it is highly
unlikely that any of these peripherals would require more than 8 or 16
bytes of alignment for the DMA operations themselves.




> No. As per one of my previous emails, kmalloc() will preserve the DMA
> alignment for an SoC even if smaller than CRYPTO_MINALIGN (or a new
> CRYPTO_DMA_MINALIGN). Since kmalloc() returns DMA-safe pointers and
> CRYPTO_MINALIGN (or a new CRYPTO_DMA_MINALIGN) is DMA-safe, so would an
> offset from a pointer returned by kmalloc().
>
> > If you want the Crypto API to guarantee alignment over and above
> > that returned by kmalloc, the correct way is to use cra_alignmask.
>
> For kmalloc(), this would work, but for the current CRYPTO_MINALIGN_ATTR
> uses it won't.
>
> Thanks.
>
> --
> Catalin