Re: [RFC PATCH v6 1/6] riscv: mm: dma-noncoherent: Switch using function pointers for cache management

From: Conor Dooley
Date: Fri Jan 06 2023 - 18:48:50 EST


+CC Icenowy

Hey Prabhakar,

On Fri, Jan 06, 2023 at 06:55:21PM +0000, Prabhakar wrote:
> From: Lad Prabhakar <prabhakar.mahadev-lad.rj@xxxxxxxxxxxxxx>
>
> The current implementation of CMO was handled using the ALTERNATIVE_X()
> macro; this was manageable when there were a limited number of platforms
> using this. Now that we are having more and more platforms coming through
> with the CMO the use of the ALTERNATIVE_X() macro becomes unmanageable.

Nitpick time! This opening paragraph is all a little oddly worded IMO.
How about:

Currently, selecting which CMOs to use on a given platform is done using
and ALTERNATIVE_X() macro. This was manageable when there were just two
CMO implementations, but now that there are more and more platforms coming
needing custom CMOs, the use of the ALTERNATIVE_X() macro is unmanageable.

> To avoid such issues this patch switches to use of function pointers
> instead of ALTERNATIVE_X() macro for cache management (the only draw being

s/draw/drawback

> performance over the previous approach).

Did you notice a performance decrease or are you speculating?
Perhaps Icenowy - who I know benched their implmentation of a new CMO
macro for THEAD stuff can do so again for this.

> void (*clean_range)(unsigned long addr, unsigned long size);
> void (*inv_range)(unsigned long addr, unsigned long size);
> void (*flush_range)(unsigned long addr, unsigned long size);
>
> The above function pointers are provided to be overridden where platforms
> using standard approach and for platforms who want handle the operation
> based on the operation the below function pointer is provided:

Think you've left out some words here chief!
Perhaps s/and for platforms who/. For platforms that/ & on the line
after, s/operation/direction/ ?

> void (*riscv_dma_noncoherent_cmo_ops)(void *vaddr, size_t size,
> enum dma_data_direction dir,
> enum dma_noncoherent_ops ops);
>
> In the current patch we have moved the ZICBOM and T-Head CMO to use
> function pointers.

"Convert ZICBOM..."

> Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@xxxxxxxxxxxxxx>
> ---
> v5->v6
> * New patch
> ---
> arch/riscv/errata/thead/errata.c | 71 ++++++++++++++++++
> arch/riscv/include/asm/dma-noncoherent.h | 83 +++++++++++++++++++++
> arch/riscv/include/asm/errata_list.h | 53 -------------
> arch/riscv/kernel/cpufeature.c | 2 +
> arch/riscv/mm/dma-noncoherent.c | 94 ++++++++++++++++++++++--
> arch/riscv/mm/pmem.c | 18 ++++-
> 6 files changed, 260 insertions(+), 61 deletions(-)
> create mode 100644 arch/riscv/include/asm/dma-noncoherent.h
>
> diff --git a/arch/riscv/errata/thead/errata.c b/arch/riscv/errata/thead/errata.c
> index fac5742d1c1e..826b2ba3e61e 100644
> --- a/arch/riscv/errata/thead/errata.c
> +++ b/arch/riscv/errata/thead/errata.c
> @@ -8,12 +8,72 @@
> #include <linux/module.h>
> #include <linux/string.h>
> #include <linux/uaccess.h>
> +#include <asm/dma-noncoherent.h>
> #include <asm/alternative.h>
> #include <asm/cacheflush.h>
> #include <asm/errata_list.h>
> #include <asm/patch.h>
> #include <asm/vendorid_list.h>
>
> +#ifdef CONFIG_ERRATA_THEAD_CMO
> +/*
> + * dcache.ipa rs1 (invalidate, physical address)
> + * | 31 - 25 | 24 - 20 | 19 - 15 | 14 - 12 | 11 - 7 | 6 - 0 |
> + * 0000001 01010 rs1 000 00000 0001011
> + * dache.iva rs1 (invalida, virtual address)
> + * 0000001 00110 rs1 000 00000 0001011
> + *
> + * dcache.cpa rs1 (clean, physical address)
> + * | 31 - 25 | 24 - 20 | 19 - 15 | 14 - 12 | 11 - 7 | 6 - 0 |
> + * 0000001 01001 rs1 000 00000 0001011
> + * dcache.cva rs1 (clean, virtual address)
> + * 0000001 00100 rs1 000 00000 0001011
> + *
> + * dcache.cipa rs1 (clean then invalidate, physical address)
> + * | 31 - 25 | 24 - 20 | 19 - 15 | 14 - 12 | 11 - 7 | 6 - 0 |
> + * 0000001 01011 rs1 000 00000 0001011
> + * dcache.civa rs1 (... virtual address)
> + * 0000001 00111 rs1 000 00000 0001011
> + *
> + * sync.s (make sure all cache operations finished)
> + * | 31 - 25 | 24 - 20 | 19 - 15 | 14 - 12 | 11 - 7 | 6 - 0 |
> + * 0000000 11001 00000 000 00000 0001011
> + */
> +#define THEAD_inval_A0 ".long 0x0265000b"
> +#define THEAD_clean_A0 ".long 0x0245000b"
> +#define THEAD_flush_A0 ".long 0x0275000b"
> +#define THEAD_SYNC_S ".long 0x0190000b"
> +
> +#define THEAD_CMO_OP(_op, _start, _size, _cachesize) \
> + asm volatile("mv a0, %1\n\t" \
> + "j 2f\n\t" \
> + "3:\n\t" \
> + THEAD_##_op##_A0 "\n\t" \
> + "add a0, a0, %0\n\t" \
> + "2:\n\t" \
> + "bltu a0, %2, 3b\n\t" \
> + THEAD_SYNC_S \
> + : : "r"(_cachesize), \
> + "r"((unsigned long)(_start) & ~((_cachesize) - 1UL)), \
> + "r"((unsigned long)(_start) + (_size)) \
> + : "a0")

I'm not going to provide a tested-by for the THEAD stuff, as I have no
metrics - but I booted my usual NFS and did some tyre kicking. Seemed
grand.

> +static void thead_cmo_clean_range(unsigned long addr, unsigned long size)
> +{
> + THEAD_CMO_OP(clean, addr, size, riscv_cbom_block_size);
> +}
> +
> +static void thead_cmo_flush_range(unsigned long addr, unsigned long size)
> +{
> + THEAD_CMO_OP(flush, addr, size, riscv_cbom_block_size);
> +}
> +
> +static void thead_cmo_inval_range(unsigned long addr, unsigned long size)
> +{
> + THEAD_CMO_OP(inval, addr, size, riscv_cbom_block_size);
> +}
> +#endif
> +
> static bool errata_probe_pbmt(unsigned int stage,
> unsigned long arch_id, unsigned long impid)
> {
> @@ -33,6 +93,8 @@ static bool errata_probe_pbmt(unsigned int stage,
> static bool errata_probe_cmo(unsigned int stage,
> unsigned long arch_id, unsigned long impid)
> {
> + struct riscv_cache_ops thead_cmo_ops;
> +
> if (!IS_ENABLED(CONFIG_ERRATA_THEAD_CMO))
> return false;
>
> @@ -44,6 +106,15 @@ static bool errata_probe_cmo(unsigned int stage,
>
> riscv_cbom_block_size = L1_CACHE_BYTES;
> riscv_noncoherent_supported();
> +
> + memset(&thead_cmo_ops, 0x0, sizeof(thead_cmo_ops));
> + if (IS_ENABLED(CONFIG_ERRATA_THEAD_CMO)) {

With CONFIG_ERRATA_THEAD_CMO=n:

/stuff/linux/arch/riscv/errata/thead/errata.c: In function 'errata_probe_cmo':
/stuff/linux/arch/riscv/errata/thead/errata.c:112:46: error: 'thead_cmo_clean_range' undeclared (first use in this function)
112 | thead_cmo_ops.clean_range = &thead_cmo_clean_range;
| ^~~~~~~~~~~~~~~~~~~~~
/stuff/linux/arch/riscv/errata/thead/errata.c:112:46: note: each undeclared identifier is reported only once for each function it appears in
/stuff/linux/arch/riscv/errata/thead/errata.c:113:44: error: 'thead_cmo_inval_range' undeclared (first use in this function)
113 | thead_cmo_ops.inv_range = &thead_cmo_inval_range;
| ^~~~~~~~~~~~~~~~~~~~~
/stuff/linux/arch/riscv/errata/thead/errata.c:114:46: error: 'thead_cmo_flush_range' undeclared (first use in this function)
114 | thead_cmo_ops.flush_range = &thead_cmo_flush_range;
| ^~~~~~~~~~~~~~~~~~~~~

These functions are only present if CONFIG_ERRATA_THEAD_CMO is set,
so this cannot be an IS_ENABLED() as things stand.

> + thead_cmo_ops.clean_range = &thead_cmo_clean_range;
> + thead_cmo_ops.inv_range = &thead_cmo_inval_range;
> + thead_cmo_ops.flush_range = &thead_cmo_flush_range;

As Arnd suggested, I'd rather see a `static const struct
riscv_cache_ops` type deal too, so you can just call register()
directly.

> + riscv_noncoherent_register_cache_ops(&thead_cmo_ops);
> + }
> +
> return true;
> }
>
> diff --git a/arch/riscv/include/asm/dma-noncoherent.h b/arch/riscv/include/asm/dma-noncoherent.h
> new file mode 100644
> index 000000000000..a2af863d2608
> --- /dev/null
> +++ b/arch/riscv/include/asm/dma-noncoherent.h
> @@ -0,0 +1,83 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (C) 2022 Renesas Electronics Corp.
> + */
> +
> +#ifndef __ASM_DMA_NONCOHERENT_H
> +#define __ASM_DMA_NONCOHERENT_H
> +
> +#include <linux/dma-direct.h>
> +
> +enum dma_noncoherent_ops {
> + NON_COHERENT_SYNC_DMA_FOR_DEVICE = 0,
> + NON_COHERENT_SYNC_DMA_FOR_CPU,
> + NON_COHERENT_DMA_PREP,
> + NON_COHERENT_DMA_PMEM,
> +};
> +
> +/*
> + * struct riscv_cache_ops - Structure for CMO function pointers

Drop the "function pointers" from everything here IMO.

> + * @clean_range: Function pointer for clean cache
> + * @inv_range: Function pointer for invalidate cache
> + * @flush_range: Function pointer for flushing the cache
> + * @riscv_dma_noncoherent_cmo_ops: Function pointer for platforms who want
> + * to handle CMO themselves. If this function pointer is set rest of the
> + * function pointers will be NULL.

Will be NULL? Or must be NULL?
Assuming you keep it, as I see Arnd is questioning it, I think a better
description is needed for this one. And a rename of the function name,
the one you have right now is oddly juxtaposed and a bit confusing.
I don't really have something much better to suggest, so I am gonna use
cmo_omnibus here...

@cmo_omnibus: For platforms whose CMO capabilities do not align with the
standard cache management operations. If set, the specific
ops must be left unset.

Circling back ages later, cmo_universal()?

> +struct riscv_cache_ops {
> + void (*clean_range)(unsigned long addr, unsigned long size);
> + void (*inv_range)(unsigned long addr, unsigned long size);
> + void (*flush_range)(unsigned long addr, unsigned long size);
> + void (*riscv_dma_noncoherent_cmo_ops)(void *vaddr, size_t size,
> + enum dma_data_direction dir,
> + enum dma_noncoherent_ops ops);
> +};
> +
> +extern struct riscv_cache_ops zicbom_cmo_ops;
> +
> +#ifdef CONFIG_RISCV_DMA_NONCOHERENT
> +
> +extern struct riscv_cache_ops noncoherent_cache_ops;
> +
> +void riscv_noncoherent_register_cache_ops(struct riscv_cache_ops *ops);
> +
> +static inline void riscv_dma_noncoherent_clean(void *vaddr, size_t size)
> +{
> + if (noncoherent_cache_ops.clean_range) {
> + unsigned long addr = (unsigned long)vaddr;
> +
> + noncoherent_cache_ops.clean_range(addr, size);
> + }
> +}
> +
> +static inline void riscv_dma_noncoherent_flush(void *vaddr, size_t size)
> +{
> + if (noncoherent_cache_ops.flush_range) {
> + unsigned long addr = (unsigned long)vaddr;
> +
> + noncoherent_cache_ops.flush_range(addr, size);
> + }
> +}
> +
> +static inline void riscv_dma_noncoherent_inval(void *vaddr, size_t size)
> +{
> + if (noncoherent_cache_ops.inv_range) {
> + unsigned long addr = (unsigned long)vaddr;
> +
> + noncoherent_cache_ops.inv_range(addr, size);
> + }
> +}
> +
> +#else
> +
> +static void riscv_noncoherent_register_cache_ops(struct riscv_cache_ops *ops) {}
> +
> +static inline void riscv_dma_noncoherent_clean(void *vaddr, size_t size) {}
> +
> +static inline void riscv_dma_noncoherent_flush(void *vaddr, size_t size) {}
> +
> +static inline void riscv_dma_noncoherent_inval(void *vaddr, size_t size) {}

Can these ever be used if DMA_NONCOHERENT is not set? I think
riscv/mm/Makefile gates their usage on the symbol, no?

> +
> +#endif
> +
> +#endif /* __ASM_DMA_NONCOHERENT_H */
> diff --git a/arch/riscv/include/asm/errata_list.h b/arch/riscv/include/asm/errata_list.h
> index 4180312d2a70..ae3fc8b80edd 100644

[...]

> diff --git a/arch/riscv/mm/dma-noncoherent.c b/arch/riscv/mm/dma-noncoherent.c
> index d919efab6eba..d9445c266bfd 100644
> --- a/arch/riscv/mm/dma-noncoherent.c
> +++ b/arch/riscv/mm/dma-noncoherent.c
> @@ -9,23 +9,82 @@
> #include <linux/dma-map-ops.h>
> #include <linux/mm.h>
> #include <asm/cacheflush.h>
> +#include <asm/dma-noncoherent.h>
>
> static bool noncoherent_supported;
>
> +struct riscv_cache_ops noncoherent_cache_ops = {
> + .clean_range = NULL,
> + .inv_range = NULL,
> + .flush_range = NULL,
> + .riscv_dma_noncoherent_cmo_ops = NULL,
> +};
> +EXPORT_SYMBOL(noncoherent_cache_ops);

These exports should be _GPL, no? The file is GPLed. Ditto the others.

> +#ifdef CONFIG_RISCV_ISA_ZICBOM
> +#define ZICBOM_CMO_OP(_op, _start, _size, _cachesize) \
> + asm volatile("mv a0, %1\n\t" \
> + "j 2f\n\t" \
> + "3:\n\t" \
> + "cbo." __stringify(_op) " (a0)\n\t" \
> + "add a0, a0, %0\n\t" \
> + "2:\n\t" \
> + "bltu a0, %2, 3b\n\t" \
> + : : "r"(_cachesize), \
> + "r"((unsigned long)(_start) & ~((_cachesize) - 1UL)), \
> + "r"((unsigned long)(_start) + (_size)) \
> + : "a0")
> +
> +static void zicbom_cmo_clean_range(unsigned long addr, unsigned long size)
> +{
> + ZICBOM_CMO_OP(clean, addr, size, riscv_cbom_block_size);
> +}
> +
> +static void zicbom_cmo_flush_range(unsigned long addr, unsigned long size)
> +{
> + ZICBOM_CMO_OP(flush, addr, size, riscv_cbom_block_size);
> +}
> +
> +static void zicbom_cmo_inval_range(unsigned long addr, unsigned long size)
> +{
> + ZICBOM_CMO_OP(inval, addr, size, riscv_cbom_block_size);
> +}
> +
> +struct riscv_cache_ops zicbom_cmo_ops = {
> + .clean_range = &zicbom_cmo_clean_range,
> + .inv_range = &zicbom_cmo_inval_range,
> + .flush_range = &zicbom_cmo_flush_range,
> +};
> +#else
> +struct riscv_cache_ops zicbom_cmo_ops = {
> + .clean_range = NULL,
> + .inv_range = NULL,
> + .flush_range = NULL,
> + .riscv_dma_noncoherent_cmo_ops = NULL,
> +};
> +#endif
> +EXPORT_SYMBOL(zicbom_cmo_ops);
> +
> void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
> enum dma_data_direction dir)
> {
> void *vaddr = phys_to_virt(paddr);
>
> + if (noncoherent_cache_ops.riscv_dma_noncoherent_cmo_ops) {
> + noncoherent_cache_ops.riscv_dma_noncoherent_cmo_ops(vaddr, size, dir,
> + NON_COHERENT_SYNC_DMA_FOR_DEVICE);
> + return;
> + }
> +
> switch (dir) {
> case DMA_TO_DEVICE:
> - ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
> + riscv_dma_noncoherent_clean(vaddr, size);
> break;
> case DMA_FROM_DEVICE:
> - ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
> + riscv_dma_noncoherent_clean(vaddr, size);
> break;
> case DMA_BIDIRECTIONAL:
> - ALT_CMO_OP(flush, vaddr, size, riscv_cbom_block_size);
> + riscv_dma_noncoherent_flush(vaddr, size);
> break;
> default:
> break;
> @@ -37,12 +96,18 @@ void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size,
> {
> void *vaddr = phys_to_virt(paddr);
>
> + if (noncoherent_cache_ops.riscv_dma_noncoherent_cmo_ops) {
> + noncoherent_cache_ops.riscv_dma_noncoherent_cmo_ops(vaddr, size, dir,
> + NON_COHERENT_SYNC_DMA_FOR_CPU);
> + return;
> + }
> +
> switch (dir) {
> case DMA_TO_DEVICE:
> break;
> case DMA_FROM_DEVICE:
> case DMA_BIDIRECTIONAL:
> - ALT_CMO_OP(flush, vaddr, size, riscv_cbom_block_size);
> + riscv_dma_noncoherent_flush(vaddr, size);
> break;
> default:
> break;
> @@ -53,7 +118,13 @@ void arch_dma_prep_coherent(struct page *page, size_t size)
> {
> void *flush_addr = page_address(page);
>
> - ALT_CMO_OP(flush, flush_addr, size, riscv_cbom_block_size);
> + if (noncoherent_cache_ops.riscv_dma_noncoherent_cmo_ops) {
> + noncoherent_cache_ops.riscv_dma_noncoherent_cmo_ops(flush_addr, size, -1,
> + NON_COHERENT_DMA_PREP);
> + return;
> + }
> +
> + riscv_dma_noncoherent_flush(flush_addr, size);
> }
>
> void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
> @@ -78,3 +149,16 @@ void riscv_noncoherent_supported(void)
> "Non-coherent DMA support enabled without a block size\n");
> noncoherent_supported = true;
> }

Barring the function name, which I really don't like - it really is
confusingly named. Something like cmo_universal() would get it's point
across a bit better I think.

> +void riscv_noncoherent_register_cache_ops(struct riscv_cache_ops *ops)
> +{
> + if (!ops)
> + return;
> +
> + if (ops->riscv_dma_noncoherent_cmo_ops)
> + noncoherent_cache_ops.riscv_dma_noncoherent_cmo_ops =
> + ops->riscv_dma_noncoherent_cmo_ops;
> + else
> + noncoherent_cache_ops = *ops;

Can we just chop off the extra step here? Behaviourally this is going to
be no different whether someone sets the universal op and the individual
ones anyway.

> +}
> +EXPORT_SYMBOL(riscv_noncoherent_register_cache_ops);

_GPL again I think!

I'm happy with this approach, modulo the minor bits I've pointed out.
I do really like that it takes us away from messing with an asm
alternative every time someone wants to do this stuff differently or for
a new platform.

I would like Heiko to have a look at what you've done here, but ye looks
promising IMO.

Thanks,
Conor.

Attachment: signature.asc
Description: PGP signature