Re: [PATCH 1/3] arm64: cpufeature: Fix the type of no FP/SIMD capability

From: Ard Biesheuvel
Date: Tue Oct 15 2019 - 05:52:28 EST


On Tue, 15 Oct 2019 at 11:44, Suzuki K Poulose <suzuki.poulose@xxxxxxx> wrote:
>
>
>
> On 14/10/2019 17:57, Ard Biesheuvel wrote:
> > On Mon, 14 Oct 2019 at 17:50, Dave P Martin <Dave.Martin@xxxxxxx> wrote:
> >>
> >> On Mon, Oct 14, 2019 at 04:45:40PM +0100, Suzuki K Poulose wrote:
> >>>
> >>>
> >>> On 14/10/2019 15:52, Dave Martin wrote:
> >>>> On Fri, Oct 11, 2019 at 06:28:43PM +0100, Suzuki K Poulose wrote:
> >>>>>
> >>>>>
> >>>>> On 11/10/2019 15:21, Dave Martin wrote:
> >>>>>> On Fri, Oct 11, 2019 at 01:13:18PM +0100, Suzuki K Poulose wrote: > Hi Dave
> >>>>>>>
> >>>>>>> On 11/10/2019 12:36, Dave Martin wrote:
> >>>>>>>> On Thu, Oct 10, 2019 at 06:15:15PM +0100, Suzuki K Poulose wrote:
> >>>>>>>>> The NO_FPSIMD capability is defined with scope SYSTEM, which implies
> >>>>>>>>> that the "absence" of FP/SIMD on at least one CPU is detected only
> >>>>>>>>> after all the SMP CPUs are brought up. However, we use the status
> >>>>>>>>> of this capability for every context switch. So, let us change
> >>>>>>>>> the scop to LOCAL_CPU to allow the detection of this capability
> >>>>>>>>> as and when the first CPU without FP is brought up.
> >>>>>>>>>
> >>>>>>>>> Also, the current type allows hotplugged CPU to be brought up without
> >>>>>>>>> FP/SIMD when all the current CPUs have FP/SIMD and we have the userspace
> >>>>>>>>> up. Fix both of these issues by changing the capability to
> >>>>>>>>> BOOT_RESTRICTED_LOCAL_CPU_FEATURE.
> >>>>>>>>>
> >>>>>>>>> Fixes: 82e0191a1aa11abf ("arm64: Support systems without FP/ASIMD")
> >>>>>>>>> Cc: Will Deacon <will@xxxxxxxxxx>
> >>>>>>>>> Cc: Mark Rutland <mark.rutland@xxxxxxx>
> >>>>>>>>> Cc: Catalin Marinas <catalin.marinas@xxxxxxx>
> >>>>>>>>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@xxxxxxx>
> >>>>>>>>> ---
> >>>>>>>>> arch/arm64/kernel/cpufeature.c | 2 +-
> >>>>>>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>>>>>>
> >>>>>>>>> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> >>>>>>>>> index 9323bcc40a58..0f9eace6c64b 100644
> >>>>>>>>> --- a/arch/arm64/kernel/cpufeature.c
> >>>>>>>>> +++ b/arch/arm64/kernel/cpufeature.c
> >>>>>>>>> @@ -1361,7 +1361,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
> >>>>>>>>> {
> >>>>>>>>> /* FP/SIMD is not implemented */
> >>>>>>>>> .capability = ARM64_HAS_NO_FPSIMD,
> >>>>>>>>> - .type = ARM64_CPUCAP_SYSTEM_FEATURE,
> >>>>>>>>> + .type = ARM64_CPUCAP_BOOT_RESTRICTED_CPU_LOCAL_FEATURE,
> >>>>>>>>
> >>>>>>>> ARM64_HAS_NO_FPSIMD is really a disability, not a capability.
> >>>>>>>>
> >>>>>>>> Although we have other things that smell like this (CPU errata for
> >>>>>>>> example), I wonder whether inverting the meaning in the case would
> >>>>>>>> make the situation easier to understand.
> >>>>>>>
> >>>>>>> Yes, it is indeed a disability, more on that below.
> >>>>>>>
> >>>>>>>>
> >>>>>>>> So, we'd have ARM64_HAS_FPSIMD, with a minimum (signed) feature field
> >>>>>>>> value of 0. Then this just looks like an ARM64_CPUCAP_SYSTEM_FEATURE
> >>>>>>>> IIUC. We'd just need to invert the sense of the check in
> >>>>>>>> system_supports_fpsimd().
> >>>>>>>
> >>>>>>> This is particularly something we want to avoid with this patch. We want
> >>>>>>> to make sure that we have the up-to-date status of the disability right
> >>>>>>> when it happens. i.e, a CPU without FP/SIMD is brought up. With SYSTEM_FEATURE
> >>>>>>> you have to wait until we bring all the CPUs up. Also, for HAS_FPSIMD,
> >>>>>>> you must wait until all the CPUs are up, unlike the negated capability.
> >>>>>>
> >>>>>> I don't see why waiting for the random defective early CPU to come up is
> >>>>>> better than waiting for all the early CPUs to come up and then deciding.
> >>>>>>
> >>>>>> Kernel-mode NEON aside, the status of this cap should not matter until
> >>>>>> we enter userspace for the first time.
> >>>>>>
> >>>>>> The only issue is if e.g., crypto drivers that can use kernel-mode NEON
> >>>>>> probe for it before all early CPUs are up, and so cache the wrong
> >>>>>> decision. The current approach doesn't cope with that anyway AFAICT.
> >>>>>
> >>>>> This approach does in fact. With LOCAL_CPU scope, the moment a defective
> >>>>> CPU turns up, we mark the "capability" and thus the kernel cannot use
> >>>>> the neon then onwards, unlike the existing case where we have time till
> >>>>> we boot all the CPUs (even when the boot CPU may be defective).
> >>>>
> >>>> I guess that makes sense.
> >>>>
> >>>> I'm now wondering what happens if anything tries to use kernel-mode NEON
> >>>> before SVE is initialised -- which doesn't happen until cpufeatures
> >>>> configures the system features.
> >>>>
> >>>> I don't think your proposed change makes anything worse here, but it may
> >>>> need looking into.
> >>>
> >>> We could throw in a WARN_ON() in kernel_neon() to make sure that the SVE
> >>> is initialised ?
> >>
> >> Could do, at least as an experiment.
> >>
> >> Ard, do you have any thoughts on this?
> >>
> >
> > All in-kernel NEON code checks whether the NEON is usable, so I'd
> > expect that check to return 'false' if it is too early in the boot for
> > the NEON to be used at all.
>
> Ok. That implies, we need a check to make sure SVE set up is complete,
> which we don't at the moment, as we default to assume FP/SIMD is available.
>
> "system_can_use_fpsimd()" instead of the "system_supports_fpsimd() where
> the former should indicate:
>
> system_supports_fpsimd() && sve_setup_complete()
>
> Where the sve_setup_complete() can itself be a static key, initialized
> very early if we have !CONFIG_SVE. Otherwise, set from sve_setup().
>
>
> Thoughts ?

Yes, that sounds reasonable. If we fold that into the implementation
of may_use_simd(), we shouldn't need any other changes to the clients
AFAICT