Re: [PATCH 1/4] Consolidate __memcpy_{to,from}io and __memset_io into a single lib

From: Arnd Bergmann
Date: Mon Sep 09 2024 - 04:37:10 EST


On Fri, Sep 6, 2024, at 11:41, Julian Vetter wrote:
> Various architectures have almost the same implementations for
> __memcpy_{to,from}io and __memset_io functions. So, consolidate them and
> introduce a CONFIG_GENERIC_IO flag to build the given lib/io.c.
>
> Reviewed-by: Yann Sionneau <ysionneau@xxxxxxxxxxxxx>
> Signed-off-by: Julian Vetter <jvetter@xxxxxxxxxxxxx>

Thanks for working on this. The implementation looks correct
to me and we should be able to use this on most architectures,
but since this is a shared file, it would be good to make it
as polished as possible.

> lib/Kconfig | 3 ++
> lib/Makefile | 2 +
> lib/io.c | 101 +++++++++++++++++++++++++++++++++++++++++++++++++++
> 3 files changed, 106 insertions(+)
> create mode 100644 lib/io.c

I feel the name is a little too geneal, maybe use io_copy.c and
CONFIG_GENERIC_IO_COPY for the name?

Alternatively, this could be part of lib/iomap_copy.c,
which already has some related helper functions. In that
case, I think we would rely on the per-architecture "#define
__memcpy_fromio __memcpy_fromio" etc macros instead of a
Kconfig symbol.

> +void __memcpy_fromio(void *to, const volatile void __iomem *from, size_t count)
> +{
> + while (count && !IS_ALIGNED((unsigned long)from, NATIVE_STORE_SIZE)) {
> + *(u8 *)to = __raw_readb(from);
> + from++;
> + to++;
> + count--;
> + }

There is a corner case that this bit actually misses (same in the
existing implementation): If the __iomem buffer is aligned, but
the 'to' buffer is not, this may cause an alignment fault on
architectures without native unaligned stores. At the moment
I think both csky and some configurations of loongarch64 can
run into this. I think the best way to deal with this is to
use get_unaligned()/put_unaligned() in place of the pointer
dereference. This has no effect on architectures that have
native unaligned access but otherwise uses byte access on the
kernel buffer, which I think is a good tradeoff.

> + while (count >= NATIVE_STORE_SIZE) {
> + *(uintptr_t *)to = __raw_read_native(from);
> + from += NATIVE_STORE_SIZE;
> + to += NATIVE_STORE_SIZE;
> + count -= NATIVE_STORE_SIZE;
> + }

The use of __raw_read_native() somehow bothers me, it seems
a little more complicated than open-coding the two
possibles paths, or even using the aligned-access
helpers like

if (IS_ENABLED(CONFIG_64BIT))
__iowrite64_copy(to, from, count & WORD_MASK)
else
__iowrite32_copy(to, from, count & WORD_MASK)

Those helpers do require the kernel buffer to be naturally
aligned though.

> +void __memset_io(volatile void __iomem *dst, int c, size_t count)
> +{
> + uintptr_t qc = (u8)c;
> +
> + qc |= qc << 8;
> + qc |= qc << 16;
> +#if IS_ENABLED(CONFIG_64BIT)
> + qc |= qc << 32;
> +#endif

If you use IS_ENABLED() here, please do it like

if (IS_ENABLED(CONFIG_64BIT)
qc |= qc << 32;

Arnd