Re: [PATCH v2] RISC-V: Increase range and default value of NR_CPUS

From: Anup Patel
Date: Fri Apr 08 2022 - 12:45:44 EST


On Fri, Apr 8, 2022 at 10:08 PM Heinrich Schuchardt
<heinrich.schuchardt@xxxxxxxxxxxxx> wrote:
>
> On 4/6/22 12:10, Anup Patel wrote:
> > On Wed, Apr 6, 2022 at 3:25 PM Heinrich Schuchardt
> > <heinrich.schuchardt@xxxxxxxxxxxxx> wrote:
> >>
> >> On 3/31/22 21:42, Palmer Dabbelt wrote:
> >>> On Sat, 19 Mar 2022 05:12:06 PDT (-0700), apatel@xxxxxxxxxxxxxxxx wrote:
> >>>> Currently, the range and default value of NR_CPUS is too restrictive
> >>>> for high-end RISC-V systems with a large number of HARTs. The latest
> >>>> QEMU virt machine supports up to 512 CPUs so the current NR_CPUS is
> >>>> restrictive for QEMU as well. Other major architectures (such as
> >>>> ARM64, x86_64, MIPS, etc) have a much higher range and default
> >>>> value of NR_CPUS.
> >>>>
> >>>> This patch increases NR_CPUS range to 2-512 and default value to
> >>>> XLEN (i.e. 32 for RV32 and 64 for RV64).
> >>>>
> >>>> Signed-off-by: Anup Patel <apatel@xxxxxxxxxxxxxxxx>
> >>>> ---
> >>>> Changes since v1:
> >>>> - Updated NR_CPUS range to 2-512 which reflects maximum number of
> >>>> CPUs supported by QEMU virt machine.
> >>>> ---
> >>>> arch/riscv/Kconfig | 7 ++++---
> >>>> 1 file changed, 4 insertions(+), 3 deletions(-)
> >>>>
> >>>> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> >>>> index 5adcbd9b5e88..423ac17f598c 100644
> >>>> --- a/arch/riscv/Kconfig
> >>>> +++ b/arch/riscv/Kconfig
> >>>> @@ -274,10 +274,11 @@ config SMP
> >>>> If you don't know what to do here, say N.
> >>>>
> >>>> config NR_CPUS
> >>>> - int "Maximum number of CPUs (2-32)"
> >>>> - range 2 32
> >>>> + int "Maximum number of CPUs (2-512)"
> >>>> + range 2 512
> >>
> >> For SBI_V01=y there seems to be a hard constraint to XLEN bits.
> >> See __sbi_v01_cpumask_to_hartmask() in arch/riscv/kernel/sbi.c.
> >>
> >> So shouldn't this be something like:
> >>
> >> range 2 512 if !SBI_V01
> >> range 2 32 if SBI_V01 && 32BIT
> >> range 2 64 if SBI_V01 && 64BIT
> >
> > This is just making it unnecessarily complicated for supporting
> > SBI v0.1
> >
> > How about removing SBI v0.1 support and the spin-wait CPU
> > operations from arch/riscv ?
>
> The SBI v0.1 specification was only a draft. Only the v1.0 version has
> ever been ratified.
>
> It would be good to remove this legacy code from Linux and U-Boot.
>
> By the way, why does upstream OpenSBI claim to be conformant to SBI v0.3
> and not to v1.0?

The ratification process for SBI v1.0 was in its early stages when OpenSBI v1.0
was being released, so we decided to keep the SBI v0.3 spec version. The
next OpenSBI v1.1 release (due in June 2022) will change to SBI v1.0.

Regards,
Anup

>
> include/sbi/sbi_ecall.h:16:
>
> #define SBI_ECALL_VERSION_MAJOR 0
> #define SBI_ECALL_VERSION_MINOR 3
>
> Best regards
>
> Heinrich
>
> >
> >>
> >>>> depends on SMP
> >>>> - default "8"
> >>>> + default "32" if 32BIT
> >>>> + default "64" if 64BIT
> >>>>
> >>>> config HOTPLUG_CPU
> >>>> bool "Support for hot-pluggable CPUs"
> >>>
> >>> I'm getting all sorts of boot issues with more than 32 CPUs, even on the
> >>> latest QEMU master. I'm not opposed to increasing the CPU count in
> >>> theory, but if we're going to have a setting that goes up to a huge
> >>> number it needs to at least boot. I've got 64 host threads, so it
> >>> shouldn't just be a scheduling thing.
> >>
> >> Currently, high-performance hardware for RISC-V is missing. So it makes
> >> sense to build software via QEMU on x86_64 or arm64 with as many
> >> hardware threads as available (128 is not uncommon).
> >>
> >> OpenSBI currently is limited to 128 threads:
> >> include/sbi/sbi_hartmask.h:22:
> >> #define SBI_HARTMASK_MAX_BITS 128
> >> This is just an arbitrary value that can be modified.
> >
> > Yes, this limit will be gradually increased with some improvements
> > to optimize runtime memory used by OpenSBI.
> >
> >>
> >> U-Boot v2022.04 qemu-riscv64_smode_defconfig has a value of
> >> CONFIG_SYS_MALLOC_F_LEN that is too low. This leads to a boot failure for
> >> more than 16 harts. A patch to correct this is pending:
> >> [PATCH v2 1/1] riscv: alloc space exhausted
> >> https://lore.kernel.org/u-boot/CAN5B=eKt=tFLZ2z3aNHJqsnJzpdA0oikcrC2i1_=ZDD=f+M0jA@xxxxxxxxxxxxxx/T/#t
> >>
> >> With QEMU 7.0 and the U-Boot fix booting into a 5.17 defconfig kernel
> >> with 64 virtual cores worked fine for me.
> >
> > Thanks for trying this patch.
> >
> > Regards,
> > Anup
> >
> >>
> >> Best regards
> >>
> >> Heinrich
> >>
> >>>
> >>> If there was some hardware that actually boots on these I'd be happy to
> >>> take it, but given that it's just QEMU I'd prefer to sort out the bugs
> >>> first. It's probably just latent bugs somewhere, but allowing users to
> >>> turn on configs we know don't work just seems like the wrong way to go.
> >>>
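
For reference, the XLEN constraint raised above for SBI_V01=y can be
illustrated with a small standalone sketch. This is not the kernel's
__sbi_v01_cpumask_to_hartmask(); the helper name hartids_to_v01_mask()
and the XLEN_BITS macro are hypothetical, and the sketch only assumes
that the legacy path packs all target harts into one unsigned long, so
any hart ID at or above the word width cannot be represented.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Width of one machine word in bits (32 on RV32, 64 on RV64). */
#define XLEN_BITS (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical helper, for illustration only: fold a list of hart IDs
 * into a single-word bitmask. Returns false as soon as a hart ID does
 * not fit into one word, which is the case that makes NR_CPUS > XLEN
 * a problem for a single-word hart mask.
 */
static bool hartids_to_v01_mask(const unsigned long *hartids, size_t n,
				unsigned long *mask)
{
	unsigned long m = 0;

	for (size_t i = 0; i < n; i++) {
		if (hartids[i] >= XLEN_BITS)
			return false;	/* hart ID not representable */
		m |= 1UL << hartids[i];
	}

	*mask = m;
	return true;
}
```

With hart IDs {0, 3} the helper succeeds and sets bits 0 and 3. With a
hart ID equal to XLEN_BITS it fails, which corresponds to the situation
the conditional ranges proposed above were meant to rule out.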