Re: [PATCH v3 1/3] riscv: optimized memcpy

From: Christoph Hellwig
Date: Mon Jun 21 2021 - 10:27:12 EST

Next message: Laurent Pinchart: "Re: [PATCH 1/3] drm/panel: Add connector_type and bus_format for AUO G104SN02 V2 panel"
Previous message: Russell King (Oracle): "Re: [PATCH net-next v4 03/10] net: sparx5: add hostmode with phylink support"
In reply to: kernel test robot: "Re: [PATCH v3 1/3] riscv: optimized memcpy"
Next in thread: David Laight: "RE: [PATCH v3 1/3] riscv: optimized memcpy"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Thu, Jun 17, 2021 at 05:27:52PM +0200, Matteo Croce wrote:
> +extern void *memcpy(void *dest, const void *src, size_t count);
> +extern void *__memcpy(void *dest, const void *src, size_t count);

No need for externs.

> +++ b/arch/riscv/lib/string.c

Nothing in her looks RISC-V specific. Why doesn't this go into lib/ so
that other architectures can use it as well.

> +#include <linux/module.h>

I think you only need export.h.

> +void *__memcpy(void *dest, const void *src, size_t count)
> +{
> + const int bytes_long = BITS_PER_LONG / 8;
> +#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
> + const int mask = bytes_long - 1;
> + const int distance = (src - dest) & mask;
> +#endif
> + union const_types s = { .u8 = src };
> + union types d = { .u8 = dest };
> +
> +#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
> + if (count < MIN_THRESHOLD)

Using IS_ENABLED we can avoid a lot of the mess in this
function.

int distance = 0;

if (!IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)) {
if (count < MIN_THRESHOLD)
goto copy_remainder;

/* copy a byte at time until destination is aligned */
for (; count && d.uptr & mask; count--)
*d.u8++ = *s.u8++;
distance = (src - dest) & mask;
}

if (distance) {
...

> + /* 32/64 bit wide copy from s to d.
> + * d is aligned now but s is not, so read s alignment wise,
> + * and do proper shift to get the right value.
> + * Works only on Little Endian machines.
> + */

Normal kernel comment style always start with a:

/*

> + for (next = s.ulong[0]; count >= bytes_long + mask; count -= bytes_long) {

Please avoid the pointlessly overlong line. And (just as a matter of
personal preference) I find for loop that don't actually use a single
iterator rather confusing. Wjy not simply:

next = s.ulong[0];
while (count >= bytes_long + mask) {
...
count -= bytes_long;
}

Next message: Laurent Pinchart: "Re: [PATCH 1/3] drm/panel: Add connector_type and bus_format for AUO G104SN02 V2 panel"
Previous message: Russell King (Oracle): "Re: [PATCH net-next v4 03/10] net: sparx5: add hostmode with phylink support"
In reply to: kernel test robot: "Re: [PATCH v3 1/3] riscv: optimized memcpy"
Next in thread: David Laight: "RE: [PATCH v3 1/3] riscv: optimized memcpy"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]