Re: [RFC PATCH 00/25] Upstream kvx Linux port
From: Arnd Bergmann
Date: Wed Jan 04 2023 - 10:59:48 EST
On Tue, Jan 3, 2023, at 17:43, Yann Sionneau wrote:
> This patch series adds support for the kv3-1 CPU architecture of the kvx family
> found in the Coolidge (aka MPPA3-80) SoC of Kalray.
>
> This is an RFC, since kvx support is not yet upstreamed into gcc/binutils,
> therefore this patch series cannot be merged into Linux for now.
>
> The goal is to have preliminary reviews and to fix problems early.
>
> The Kalray VLIW processor family (kvx) has the following features:
> * 32/64 bits execution mode
> * 6-issue VLIW architecture
> * 64 x 64bits general purpose registers
> * SIMD instructions
> * little-endian
> * deep learning co-processor
Thanks for posting these, I had been wondering about the
state of the port. Overall this looks really nice, I can
see that you and the team have looked at other ports
and generally made the right decisions.
I commented on the syscall patch directly, I think it's
important to stop using the deprecated syscalls as soon
as possible to avoid having dependencies in too many
libc binaries. Almost everything else can be changed
easily as you get closer to upstream inclusion.
I did not receive most of the other patches as I'm
not subscribed to all the mainline lists. For future
submissions, can you add the linux-arch list to Cc for
all patches?
Reading the rest of the series through lore.kernel.org,
most of the comments I have are for improvements that
you may find valuable rather than serious mistakes:
- the {copy_to,copy_from,clear}_user functions are
well worth optimizing better than the byte-at-a-time
version you have, even just a C version built around
your __get_user/__put_user inline asm should help, and
could be added to lib/usercopy.c.
- The __raw_{read,write}{b,w,l,q} helpers should
normally be defined as inline asm instead of
volatile pointer dereferences, I've seen cases where
the compiler ends up splitting the access or does
other things you may not want on MMIO areas.
- I would recomment implementing HAVE_ARCH_VMAP_STACK
as well as IRQ stacks, both of these help to
avoid data corruption from stack overflow that you
will eventually run into.
- You use qspinlock as the only available spinlock
implementation, but only support running on a
single cluster of 16 cores. It may help to use
the generic ticket spinlock instead, or leave it
as a Kconfig option, in particular since you only
have the emulated xchg16() atomic for qspinlock.
- Your defconfig file enables CONFIG_EMBEDDED, which
in turn enables CONFIG_EXPERT. This is probably
not what you want, so better turn off both of these.
- The GENERIC_CALIBRATE_DELAY should not be necessary
since you have a get_cycles() based delay loop.
Just set loops_per_jiffy to the correct value based
on the frequency of the cycle counter, to save
a little time during boot and get a more accurate
delay loop.
Arnd