Re: [PATCH 16/52] [microblaze] supported function for memory -kernel/lib

From: Jan Engelhardt
Date: Tue Jan 29 2008 - 16:26:24 EST



On Jan 24 2008 16:02, monstr@xxxxxxxxx wrote:
>diff --git a/arch/microblaze/lib/memcpy.c b/arch/microblaze/lib/memcpy.c
>new file mode 100644
>index 0000000..cc13ebd
>--- /dev/null
>+++ b/arch/microblaze/lib/memcpy.c
>@@ -0,0 +1,159 @@
>+/* Filename: memcpy.c
>+ *

Please do not put filenames inside the files themselves. It is redundant
and hard to keep up to date should things get moved around at some point.

>+#define BYTE_BLIT_STEP(d, s, h, o) \
>+ { register unsigned _v_; _v_ = *((unsigned *)(s))++; \
>+ *((unsigned *)(d))++ = (h) | _v_ >> (32-(o)); \
>+ (h) = _v_ << (o); \
>+ }

'register' is a relic from 30 years ago. Compilers largely ignore it
these days, so it can just be removed.

>+ /* This code was put into place as we transitioned from gcc3 to gcc4.
>+ * gcc4 does not support the syntax *((char *)d)++ which causes the

Of course not, because (char *)d is not an lvalue - and if you ask
me, it is ambiguous and bogus code.

Assuming 'd' was int*, d++ would cause it to be incremented by 4.

If you incremented 'd' with char* semantics, it would be incremented
by 1. But since 'd' is still int*, you now have an unaligned pointer,
which is a problem in itself. We do not need more pitfalls than C
already provides.
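
To make the pitfall concrete, a sketch (made-up names, not code from
the patch):

	#include <stdint.h>

	static uint32_t *increment_demo(uint32_t *d)
	{
		d++;			/* int* step: +4 bytes, stays aligned */
		return (uint32_t *)((char *)d + 1);
					/* char-sized step: +1 byte; the result is a
					 * misaligned uint32_t pointer, and dereferencing
					 * it can fault on strict-alignment targets */
	}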

>+ if (c >= 4) {
>+ unsigned x, a, h, align;

You will want to use unsigned long here, since you are working with
pointer values.

>+ /* Align the destination to a word boundry. */
>+ /* This is done in an endian independant manner. */
>+ switch ((unsigned) d & 3) {
>+ case 1:
>+ *((char *) d)++ = *((char *) s)++;
>+ c--;
>+ case 2:
>+ *((char *) d)++ = *((char *) s)++;
>+ c--;
>+ case 3:
>+ *((char *) d)++ = *((char *) s)++;
>+ c--;
>+ }

This gets my strongest NAK. Please rewrite it in a sane manner.
In other words, like this:

+void *memcpy(void *v_dest, const void *v_src, __kernel_size_t c)
+{
+ const char *src = v_src;
+ char *dest = v_dest;
+
+ const uint32_t *i_src;
+ uint32_t *i_dest;
+
+ if (c >= 4) {
+ unsigned x, a, h, align;
+
+ /* Align the destination to a word boundary. */
+ /* This is done in an endian-independent manner. */
+ switch ((unsigned long)dest & 3) {
+ case 1:
+ *dest++ = *src++;
+ --c;
+ case 2:
+ *dest++ = *src++;
+ --c;
+ case 3:
+ *dest++ = *src++;
+ --c;
+ }
+ /* Choose a copy scheme based on the source */
+ /* alignment relative to destination. */
+ switch ((unsigned long)src & 3) {
+ case 0x0: /* Both byte offsets are aligned */
+
+ i_src = (const void *)src;
+ i_dest = (void *)dest;
+
+ for (; c >= 4; c -= 4)
+ *i_dest++ = *i_src++;
+
+ src = (const void *)i_src;
+ dest = (void *)i_dest;
+
+ break;
+
+ case 0x1: /* Unaligned - Off by 1 */
...

The BYTE_BLIT_ macros need reworking as well; a sketch of one way to
do it follows after the listing. The suggestions so far should provide
everything needed.

+ }
+
+ }
+
+ /* Finish off any remaining bytes */
+ /* simple fast copy, ... unless a cache boundary is crossed */
+ switch (c) {
+ case 3:
+ *dest++ = *src++;
+ case 2:
+ *dest++ = *src++;
+ case 1:
+ *dest++ = *src++;
+ }
+
+ return v_dest;
+#endif
+}

- and voilà, you can use it even with gcc4.
The same goes for memmove.c and memset.c.
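
As for the BYTE_BLIT_STEP rework mentioned above, one possibility -
just a sketch, written against the i_dest/i_src pointers from the
rewrite, not a drop-in replacement for the patch's macro - would be:

	#define BYTE_BLIT_STEP(i_dest, i_src, h, o)			\
		do {							\
			uint32_t _v_ = *(i_src)++;			\
			*(i_dest)++ = (h) | (_v_ >> (32 - (o)));	\
			(h) = _v_ << (o);				\
		} while (0)

Same shift-and-carry arithmetic as before, but it increments real
uint32_t pointers instead of casting, drops 'register', and the
do-while makes it behave like a single statement at the call site.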

By the way, I think you should just replace lib/string.c's memcpy,
etc. (which still do per-byte copying) with this optimized variant
instead, so everyone can benefit.
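
For reference, the generic one is roughly this byte-at-a-time loop
(quoting from memory, so check lib/string.c before relying on it):

	void *memcpy(void *dest, const void *src, size_t count)
	{
		char *tmp = dest;
		const char *s = src;

		while (count--)
			*tmp++ = *s++;
		return dest;
	}

so an aligned word-copy variant in the generic lib would help every
arch that does not provide its own.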