Re: [PATCH v3 1/3] riscv: optimized memcpy

From: Christoph Hellwig
Date: Mon Jun 21 2021 - 10:27:12 EST


On Thu, Jun 17, 2021 at 05:27:52PM +0200, Matteo Croce wrote:
> +extern void *memcpy(void *dest, const void *src, size_t count);
> +extern void *__memcpy(void *dest, const void *src, size_t count);

No need for externs.

> +++ b/arch/riscv/lib/string.c

Nothing in here looks RISC-V specific. Why doesn't this go into lib/ so
that other architectures can use it as well?

> +#include <linux/module.h>

I think you only need export.h.

> +void *__memcpy(void *dest, const void *src, size_t count)
> +{
> + const int bytes_long = BITS_PER_LONG / 8;
> +#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
> + const int mask = bytes_long - 1;
> + const int distance = (src - dest) & mask;
> +#endif
> + union const_types s = { .u8 = src };
> + union types d = { .u8 = dest };
> +
> +#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
> + if (count < MIN_THRESHOLD)

Using IS_ENABLED we can avoid a lot of the mess in this
function.

	int distance = 0;

	if (!IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)) {
		if (count < MIN_THRESHOLD)
			goto copy_remainder;

		/* copy a byte at a time until destination is aligned */
		for (; count && d.uptr & mask; count--)
			*d.u8++ = *s.u8++;
		distance = (src - dest) & mask;
	}

	if (distance) {
		...
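
For illustration, here is a minimal userspace sketch of the same shape.
It is not the patch's code: SKETCH_EFFICIENT_UNALIGNED is a hypothetical
stand-in for IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS), the
value of SKETCH_MIN_THRESHOLD is assumed, and sketch_align_dest() is a
made-up helper name. The point is only that a plain if on a compile-time
constant keeps both branches visible to the compiler while the dead one
is still optimized away.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS). */
#define SKETCH_EFFICIENT_UNALIGNED 0
/* Assumed value; the patch defines its own MIN_THRESHOLD. */
#define SKETCH_MIN_THRESHOLD 16

/*
 * Copy leading bytes until *dest is word aligned, advancing both cursors
 * and decrementing *count. Returns the remaining misalignment between
 * source and destination, or 0 when nothing was done (short copy or
 * efficient unaligned access), in which case the cursors are untouched.
 */
static size_t sketch_align_dest(unsigned char **dest,
				const unsigned char **src, size_t *count)
{
	const size_t mask = sizeof(long) - 1;
	size_t distance = 0;

	/* Plain C condition: both branches are parsed and type checked. */
	if (!SKETCH_EFFICIENT_UNALIGNED) {
		if (*count < SKETCH_MIN_THRESHOLD)
			return 0;

		/* copy a byte at a time until destination is aligned */
		for (; *count && ((uintptr_t)*dest & mask); (*count)--)
			*(*dest)++ = *(*src)++;
		distance = ((uintptr_t)*src - (uintptr_t)*dest) & mask;
	}

	return distance;
}
```

A caller would pass the addresses of its running pointers and of the
remaining byte count, then pick the shifted or the plain word loop
depending on whether the returned distance is non-zero.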

> + /* 32/64 bit wide copy from s to d.
> + * d is aligned now but s is not, so read s alignment wise,
> + * and do proper shift to get the right value.
> + * Works only on Little Endian machines.
> + */

Normal kernel comment style always starts with a:

/*


> + for (next = s.ulong[0]; count >= bytes_long + mask; count -= bytes_long) {

Please avoid the pointlessly overlong line. And (just as a matter of
personal preference) I find for loops that don't actually use a single
iterator rather confusing. Why not simply:

	next = s.ulong[0];
	while (count >= bytes_long + mask) {
		...
		count -= bytes_long;
	}
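
To make the loop shape concrete, here is a self-contained userspace
sketch of a shifted word copy written as a while loop. It is a guess at
what the elided body could look like, not the code from the patch:
copy_shifted_words() is a hypothetical helper, it assumes a little
endian machine and a word-aligned destination, and, like the quoted
loop, it reads the whole aligned word that contains the first source
byte, so the caller has to make sure that word is readable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: copy whole words to an aligned dest from a
 * misaligned src by combining two aligned loads with shifts.
 * Little endian only. Returns the number of bytes copied; the caller
 * copies whatever is left byte by byte.
 */
static size_t copy_shifted_words(void *dest, const void *src, size_t count)
{
	const size_t bytes_long = sizeof(unsigned long);
	const size_t mask = bytes_long - 1;
	const size_t distance = (uintptr_t)src & mask;
	const unsigned long *s;
	unsigned long *d = dest;
	unsigned long last, next;
	size_t copied = 0;

	assert(((uintptr_t)dest & mask) == 0);

	/* Nothing to shift when src is aligned as well. */
	if (!distance)
		return 0;

	/* Round src down to the aligned word that contains its first byte. */
	s = (const unsigned long *)((const unsigned char *)src - distance);

	next = s[0];
	while (count >= bytes_long + mask) {
		last = next;
		next = s[1];

		/* Low bytes come from the old word, high bytes from the new one. */
		d[0] = (last >> (distance * 8)) |
		       (next << ((bytes_long - distance) * 8));

		d++;
		s++;
		count -= bytes_long;
		copied += bytes_long;
	}

	return copied;
}
```

The bytes_long + mask bound guarantees that the second aligned load in
each iteration stays inside the source buffer, which is why the loop
stops early and leaves a short tail for a byte-wise copy.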