Re: [v3 PATCH 8/8] RISC-V: Assign hwcap only according to boot cpu.

From: David Abdurachmanov
Date: Fri Feb 08 2019 - 23:32:30 EST


On Sat, Feb 9, 2019 at 12:03 AM Atish Patra <atish.patra@xxxxxxx> wrote:
>
> On 2/8/19 1:11 AM, Christoph Hellwig wrote:
> >> + * We don't support running Linux on hertergenous ISA systems.
> >> + * But first "okay" processor might not be the boot cpu.
> >> + * Check the ISA of boot cpu.
> >
> > Please use up your available 80 characters per line in comments.
> >
> I will fix it.
>
> >> + /*
> >> + * All "okay" hart should have same isa. We don't know how to
> >> + * handle if they don't. Throw a warning for now.
> >> + */
> >> + if (elf_hwcap && temp_hwcap != elf_hwcap)
> >> + pr_warn("isa mismatch: 0x%lx != 0x%lx\n",
> >> + elf_hwcap, temp_hwcap);
> >> +
> >> + if (hartid == boot_cpu_hartid)
> >> + boot_hwcap = temp_hwcap;
> >> + elf_hwcap = temp_hwcap;
> >
> > So we always set elf_hwcap to the capabilities of the previous cpu.
> >
> >> + temp_hwcap = 0;
> >
> > I think tmp_hwcap should be declared and initialized inside the outer loop
> > instead having to manually reset it like this.
> >
> >> + }
> >>
> >> + elf_hwcap = boot_hwcap;
> >
> > And then reset it here to the boot cpu.
> >
> > Shoudn't we only report the features supported by all cores? Otherwise
> > we'll still have problems if the boot cpu supports a feature, but not
> > others.
> >
>
> Hmm. The other side of the argument is boot cpu does have a feature that
> is not supported by other hart that didn't even boot.
> The user space may execute something based on boot cpu capability but
> that won't be enabled.
>
> At least, in this way we know that we are compatible completely with
> boot cpu capabilities. Thoughts ?

There is one example on the market, e.g., Samsung Exynos 9810.

Mongoose 3 (big cores) only support ARMv8.0, while Cortex-A55
(little ones) support ARMv8.2 (and that brings atomics support).
I think, it's the only ARM SOC that supports different ISA extensions
between cores on the same package.

Kernel scheduler doesn't know that big cores are missing atomics
support or that applications needs it and moves the thread
resulting in illegal instruction.

E.g., see Golang issue: https://github.com/golang/go/issues/28431

I also recall Jon Masters (Computer Architect at Red Hat) advocating
against having cores with mismatched capabilities on the server market.

It just causes more problems down the line.

david