Re: [PATCH 1/3] arm64: cpufeature: Fix the type of no FP/SIMD capability

From: Ard Biesheuvel
Date: Tue Oct 15 2019 - 06:30:32 EST


On Tue, 15 Oct 2019 at 12:25, Dave Martin <Dave.Martin@xxxxxxx> wrote:
>
> On Mon, Oct 14, 2019 at 06:57:30PM +0200, Ard Biesheuvel wrote:
> > On Mon, 14 Oct 2019 at 17:50, Dave P Martin <Dave.Martin@xxxxxxx> wrote:
> > >
> > > On Mon, Oct 14, 2019 at 04:45:40PM +0100, Suzuki K Poulose wrote:
> > > >
> > > >
> > > > On 14/10/2019 15:52, Dave Martin wrote:
> > > > > On Fri, Oct 11, 2019 at 06:28:43PM +0100, Suzuki K Poulose wrote:
> > > > >>
> > > > >>
> > > > >> On 11/10/2019 15:21, Dave Martin wrote:
> > > > >>> On Fri, Oct 11, 2019 at 01:13:18PM +0100, Suzuki K Poulose wrote: > Hi Dave
> > > > >>>>
> > > > >>>> On 11/10/2019 12:36, Dave Martin wrote:
> > > > >>>>> On Thu, Oct 10, 2019 at 06:15:15PM +0100, Suzuki K Poulose wrote:
> > > > >>>>>> The NO_FPSIMD capability is defined with scope SYSTEM, which implies
> > > > >>>>>> that the "absence" of FP/SIMD on at least one CPU is detected only
> > > > >>>>>> after all the SMP CPUs are brought up. However, we use the status
> > > > >>>>>> of this capability for every context switch. So, let us change
> > > > >>>>>> the scop to LOCAL_CPU to allow the detection of this capability
> > > > >>>>>> as and when the first CPU without FP is brought up.
> > > > >>>>>>
> > > > >>>>>> Also, the current type allows hotplugged CPU to be brought up without
> > > > >>>>>> FP/SIMD when all the current CPUs have FP/SIMD and we have the userspace
> > > > >>>>>> up. Fix both of these issues by changing the capability to
> > > > >>>>>> BOOT_RESTRICTED_LOCAL_CPU_FEATURE.
> > > > >>>>>>
> > > > >>>>>> Fixes: 82e0191a1aa11abf ("arm64: Support systems without FP/ASIMD")
> > > > >>>>>> Cc: Will Deacon <will@xxxxxxxxxx>
> > > > >>>>>> Cc: Mark Rutland <mark.rutland@xxxxxxx>
> > > > >>>>>> Cc: Catalin Marinas <catalin.marinas@xxxxxxx>
> > > > >>>>>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@xxxxxxx>
> > > > >>>>>> ---
> > > > >>>>>> arch/arm64/kernel/cpufeature.c | 2 +-
> > > > >>>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
> > > > >>>>>>
> > > > >>>>>> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> > > > >>>>>> index 9323bcc40a58..0f9eace6c64b 100644
> > > > >>>>>> --- a/arch/arm64/kernel/cpufeature.c
> > > > >>>>>> +++ b/arch/arm64/kernel/cpufeature.c
> > > > >>>>>> @@ -1361,7 +1361,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
> > > > >>>>>> {
> > > > >>>>>> /* FP/SIMD is not implemented */
> > > > >>>>>> .capability = ARM64_HAS_NO_FPSIMD,
> > > > >>>>>> - .type = ARM64_CPUCAP_SYSTEM_FEATURE,
> > > > >>>>>> + .type = ARM64_CPUCAP_BOOT_RESTRICTED_CPU_LOCAL_FEATURE,
> > > > >>>>>
> > > > >>>>> ARM64_HAS_NO_FPSIMD is really a disability, not a capability.
> > > > >>>>>
> > > > >>>>> Although we have other things that smell like this (CPU errata for
> > > > >>>>> example), I wonder whether inverting the meaning in the case would
> > > > >>>>> make the situation easier to understand.
> > > > >>>>
> > > > >>>> Yes, it is indeed a disability, more on that below.
> > > > >>>>
> > > > >>>>>
> > > > >>>>> So, we'd have ARM64_HAS_FPSIMD, with a minimum (signed) feature field
> > > > >>>>> value of 0. Then this just looks like an ARM64_CPUCAP_SYSTEM_FEATURE
> > > > >>>>> IIUC. We'd just need to invert the sense of the check in
> > > > >>>>> system_supports_fpsimd().
> > > > >>>>
> > > > >>>> This is particularly something we want to avoid with this patch. We want
> > > > >>>> to make sure that we have the up-to-date status of the disability right
> > > > >>>> when it happens. i.e, a CPU without FP/SIMD is brought up. With SYSTEM_FEATURE
> > > > >>>> you have to wait until we bring all the CPUs up. Also, for HAS_FPSIMD,
> > > > >>>> you must wait until all the CPUs are up, unlike the negated capability.
> > > > >>>
> > > > >>> I don't see why waiting for the random defective early CPU to come up is
> > > > >>> better than waiting for all the early CPUs to come up and then deciding.
> > > > >>>
> > > > >>> Kernel-mode NEON aside, the status of this cap should not matter until
> > > > >>> we enter userspace for the first time.
> > > > >>>
> > > > >>> The only issue is if e.g., crypto drivers that can use kernel-mode NEON
> > > > >>> probe for it before all early CPUs are up, and so cache the wrong
> > > > >>> decision. The current approach doesn't cope with that anyway AFAICT.
> > > > >>
> > > > >> This approach does in fact. With LOCAL_CPU scope, the moment a defective
> > > > >> CPU turns up, we mark the "capability" and thus the kernel cannot use
> > > > >> the neon then onwards, unlike the existing case where we have time till
> > > > >> we boot all the CPUs (even when the boot CPU may be defective).
> > > > >
> > > > > I guess that makes sense.
> > > > >
> > > > > I'm now wondering what happens if anything tries to use kernel-mode NEON
> > > > > before SVE is initialised -- which doesn't happen until cpufeatures
> > > > > configures the system features.
> > > > >
> > > > > I don't think your proposed change makes anything worse here, but it may
> > > > > need looking into.
> > > >
> > > > We could throw in a WARN_ON() in kernel_neon() to make sure that the SVE
> > > > is initialised ?
> > >
> > > Could do, at least as an experiment.
> > >
> > > Ard, do you have any thoughts on this?
> > >
> >
> > All in-kernel NEON code checks whether the NEON is usable, so I'd
> > expect that check to return 'false' if it is too early in the boot for
> > the NEON to be used at all.
>
> My concern is that the check may be done once, at probe time, for crypto
> drivers. If probing happens before system_supports_fpsimd() has
> stabilised, we may be stuck with the wrong probe decision.
>
> So: are crypto drivers and kernel_mode_neon() users definitely only
> probed _after_ all early CPUs are up?
>

Isn't SMP already up when initcalls are processed?