Re: [PATCH] arm64: mm: take CWG into account in __inval_cache_range()

From: Ard Biesheuvel
Date: Tue Apr 19 2016 - 11:39:04 EST


On 19 April 2016 at 17:32, Catalin Marinas <catalin.marinas@xxxxxxx> wrote:
> On Tue, Apr 19, 2016 at 04:48:32PM +0200, Ard Biesheuvel wrote:
>> On 19 April 2016 at 16:13, Catalin Marinas <catalin.marinas@xxxxxxx> wrote:
>> > The best we could do is to warn if ARCH_DMA_MINALIGN is smaller than CWG
>> > (as Robin suggested, we could do this only if we have non-coherent DMA
>> > masters via arch_setup_dma_ops()). Quick hack below:
>> >
>> > -------------------------------8<-----------------------------------
>> > diff --git a/arch/arm64/include/asm/cache.h b/arch/arm64/include/asm/cache.h
>> > index 5082b30bc2c0..5967fcbb617a 100644
>> > --- a/arch/arm64/include/asm/cache.h
>> > +++ b/arch/arm64/include/asm/cache.h
>> > @@ -28,7 +28,7 @@
>> >   * cache before the transfer is done, causing old data to be seen by
>> >   * the CPU.
>> >   */
>> > -#define ARCH_DMA_MINALIGN	L1_CACHE_BYTES
>> > +#define ARCH_DMA_MINALIGN	128
>> >
>> >  #ifndef __ASSEMBLY__
>> >
>> > @@ -37,7 +37,7 @@
>> >  static inline int cache_line_size(void)
>> >  {
>> >  	u32 cwg = cache_type_cwg();
>> > -	return cwg ? 4 << cwg : L1_CACHE_BYTES;
>> > +	return cwg ? 4 << cwg : ARCH_DMA_MINALIGN;
>>
>> Unrelated, but this does not look right: if the CWG field is zero, we
>> should either assume 2 KB, or iterate over all the CCSIDR values and
>> take the maximum linesize.
>
> It may be a better guess but even that is not always relevant since
> CCSIDR may not present the real hardware information. It's only meant to
> give enough information to be able to do cache maintenance by set/way
> and we've seen CPU implementations where this has nothing to do with the
> actual cache geometry.
>

I am aware of that discussion, but it was about inferring aliasing
properties from the way size, which combines the linesize with the
number of sets/ways. The latter are apparently set to 1/1 in some
cases, so that any set/way operation simply affects the entire cache.

However, the CCSIDR linesize field itself is mentioned in the
description of CWG in the ARM ARM as a suitable source for obtaining
the maximum linesize in the system.
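
To make the fallback concrete, here is a rough sketch of what I have in
mind. It is not a patch: the helper names are made up, and it decodes raw
register values handed in by the caller rather than reading CTR_EL0 or
CCSIDR_EL1 itself. The field positions (CWG in CTR_EL0 bits [27:24],
LineSize in CCSIDR bits [2:0], encoded as log2(bytes) - 4) are my reading
of the ARM ARM and should be double-checked.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only, not existing kernel API. */

#define CTR_CWG_SHIFT		24
#define CTR_CWG_MASK		0xfU
#define CCSIDR_LINESIZE_MASK	0x7U
#define CWG_MAX_BYTES		2048U	/* assumed worst case: 512 words */

/* CCSIDR.LineSize holds log2(bytes per line) - 4. */
static inline uint32_t ccsidr_line_bytes(uint32_t ccsidr)
{
	return 16U << (ccsidr & CCSIDR_LINESIZE_MASK);
}

/*
 * Use CWG when it is non-zero. Otherwise take the largest line size
 * among the supplied CCSIDR values, and fall back to 2 KB if none
 * were supplied.
 */
static inline uint32_t max_line_bytes(uint32_t ctr, const uint32_t *ccsidr,
				      size_t n)
{
	uint32_t cwg = (ctr >> CTR_CWG_SHIFT) & CTR_CWG_MASK;
	uint32_t max = 0;
	size_t i;

	if (cwg)
		return 4U << cwg;	/* CWG is log2 of the size in words */

	for (i = 0; i < n; i++) {
		uint32_t bytes = ccsidr_line_bytes(ccsidr[i]);

		if (bytes > max)
			max = bytes;
	}

	return max ? max : CWG_MAX_BYTES;
}
```

In the kernel, the CCSIDR values would have to be collected by selecting
each cache level via CSSELR first, which is left out here.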

> So I don't think we can do anything more than just hard-coding and hope
> that implementations where CWG is 0 (or higher than 128) are only
> deployed in a fully coherent configuration.
>