Re: [PATCH] sh: Remove IO memcpy and memset from sh code
From: Arnd Bergmann
Date: Tue Jan 28 2025 - 04:25:51 EST
On Tue, Jan 28, 2025, at 09:42, Julian Vetter wrote:
> Remove IO memcpy and memset from sh specific code and fall back to the
> new implementation from lib/iomem_copy.c. It uses word accesses if the
> buffers are aligned and only falls back to byte accesses for potentially
> unaligned parts of a buffer. Keep only the SH4 optimized memcpy_fromio.
>
> Signed-off-by: Julian Vetter <julian@xxxxxxxxxxxxxxxx>
This looks good in pinciple, but I see one mistake:
> +#ifdef CONFIG_CPU_SH4
> +void memcpy_fromio(void *to, const volatile void __iomem *from, size_t
> count)
> {
> /*
> * Would it be worthwhile doing byte and long transfers first
> * to try and get aligned?
> */
> -#ifdef CONFIG_CPU_SH4
> if ((count >= 0x20) &&
> (((u32)to & 0x1f) == 0) && (((u32)from & 0x3) == 0)) {
> int tmp2, tmp3, tmp4, tmp5, tmp6;
> @@ -53,59 +50,6 @@ void memcpy_fromio(void *to, const volatile void
> __iomem *from, unsigned long co
> : "7"(from), "0" (to), "1" (count)
> : "r0", "r7", "t", "memory");
> }
> -#endif
> -
> - if ((((u32)to | (u32)from) & 0x3) == 0) {
> - for (; count > 3; count -= 4) {
> - *(u32 *)to = *(volatile u32 *)from;
> - to += 4;
> - from += 4;
> - }
> - }
> -
The SH4 version still needs the bottom of the function to
handle data that is not a multiple of 32 bytes long.
I would expect gcc to produce a properly optimized
version for sh4 from the generic code as well, so I would
suggest you remove it entirely and rely on the common code
here.
Arnd