Re: [PATCH v4 2/5] irqchip, gicv3: Workaround for Cavium ThunderX erratum 23154

From: Catalin Marinas
Date: Tue Sep 08 2015 - 05:37:38 EST


On Tue, Sep 08, 2015 at 10:09:30AM +0100, Suzuki K. Poulose wrote:
> On 08/09/15 10:00, Catalin Marinas wrote:
> >On Mon, Sep 07, 2015 at 06:41:50PM +0100, Suzuki K. Poulose wrote:
> >>On 07/09/15 18:15, Catalin Marinas wrote:
> >>>On Mon, Sep 07, 2015 at 05:54:06PM +0100, Suzuki K. Poulose wrote:
> >>>>On 14/08/15 19:28, Robert Richter wrote:
> >>>>>+static void gicv3_enable_quirks(void)
> >>>>>+{
> >>>>>+	if (cpus_have_cap(ARM64_WORKAROUND_CAVIUM_23154))
> >>>>>+		static_key_slow_inc(&is_cavium_thunderx);
> >>>>
> >>>>Maybe you could use the enable() method added to struct arm64_cpu_capability
> >>>>here to perform the above operation, added by James:
> >>>>
> >>>>commit 1c0763037f1e1caef739e36e09c6d41ed7b61b2d
> >>>>Author: James Morse <james.morse@xxxxxxx>
> >>>>Date: Tue Jul 21 13:23:28 2015 +0100
> >>>>
> >>>> arm64: kernel: Add cpufeature 'enable' callback
> >>>
> >>>I thought about this as well when looking at the patch but decided it's
> >>>better as it is. The "enable" method is meant to enable per-CPU features
> >>>(or workarounds) but here it is about GICv3, so we don't want to enable
> >>>for every CPU.
> >>
> >>Right. I have been playing with a series where the checks are delayed until
> >>all CPUs are brought up.
> >
> >Unrelated to the GIC workaround, delaying the enable feature until the
> >CPUs are brought up is not always feasible.
>
> Right. But then, enabling a feature (and applying the alternatives) based on
> a single CPU may not always be safe, e.g. for PAN. If one of the boot-time
> CPUs doesn't have it, then we are in trouble (even though we WARN about it
> from the SANITY check).

I see your point but there's a trade-off. For some features it's not
feasible to postpone until user space (e.g. errata workarounds). But if
a CPU coming up late doesn't have compatible features, just keep it in a
loop (or park it back if possible or even refuse to boot any further). I
don't think we should cater for insane hardware configurations (e.g. mix
of PAN/no-PAN as we already do the code patching). Do you plan to defer
code patching as well?
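
To make the "keep it in a loop or park it" idea concrete, here is a minimal
user-space sketch of the compatibility check a late CPU could go through.
It is not kernel code: struct caps, verify_late_cpu() and the bit layout
are all hypothetical names made up for this illustration, and it assumes
features and errata can be represented as simple bitmaps.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bitmaps; bit positions are made up for this example. */
struct caps {
	unsigned long features;	/* optional features, e.g. one bit for PAN */
	unsigned long errata;	/* workarounds needed (CPU) or applied (system) */
};

enum late_cpu_verdict {
	LATE_CPU_OK,	/* safe to let the CPU join */
	LATE_CPU_PARK,	/* keep it in a loop or park it */
};

/*
 * Compare a late CPU against what the system has already committed to
 * (for instance through code patching).
 */
static enum late_cpu_verdict verify_late_cpu(const struct caps *sys,
					     const struct caps *cpu)
{
	assert(sys != NULL && cpu != NULL);

	/* Features the kernel already relies on but this CPU lacks. */
	unsigned long missing_features = sys->features & ~cpu->features;
	/* Workarounds this CPU needs that were never applied. */
	unsigned long missing_errata = cpu->errata & ~sys->errata;

	if (missing_features || missing_errata)
		return LATE_CPU_PARK;
	return LATE_CPU_OK;
}
```

A CPU with extra features is accepted in this sketch since the kernel
simply would not use them; only a CPU that lacks something already patched
in, or that needs a workaround nobody applied, is turned away.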

Note that we may have to use the .enable function for errata workarounds
as well, not just features like PAN (we currently only do code patching
but we may have to do other things like issuing SMC calls, you never
know what's going to hit us).
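
As a rough illustration of an erratum entry that needs more than code
patching, the sketch below models a capability table with an optional
enable callback. It is a simplified user-space mock, not the real
arm64_cpu_capabilities layout: struct cpu_capability, check_local_cpu(),
the match functions and the firmware_calls counter (standing in for
something like an SMC call) are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a capability table entry. */
struct cpu_capability {
	const char *desc;
	int id;					/* bit in detected_caps */
	bool (*matches)(unsigned int midr);	/* does this CPU have/need it? */
	void (*enable)(void);			/* optional per-CPU setup */
};

enum { CAP_ERRATUM_EXAMPLE, CAP_FEATURE_EXAMPLE, NCAPS };

static unsigned long detected_caps;	/* system-wide, one bit per capability */
static int firmware_calls;		/* counts simulated firmware calls */

/* Made-up matching rules, for illustration only. */
static bool match_erratum(unsigned int midr) { return (midr & 0xff) == 0xa1; }
static bool match_feature(unsigned int midr) { return (midr >> 8) & 1; }

/* Stand-in for extra work such as a firmware call. */
static void erratum_enable(void) { firmware_calls++; }

static const struct cpu_capability caps[] = {
	{ "example erratum", CAP_ERRATUM_EXAMPLE, match_erratum, erratum_enable },
	{ "example feature", CAP_FEATURE_EXAMPLE, match_feature, NULL },
};

/* Run on each CPU as it comes up: detect first, then enable. */
static void check_local_cpu(unsigned int midr)
{
	size_t i, n = sizeof(caps) / sizeof(caps[0]);

	for (i = 0; i < n; i++) {
		assert(caps[i].id < NCAPS);
		if (caps[i].matches(midr))
			detected_caps |= 1UL << caps[i].id;
	}
	/* Second pass: per-CPU enable for everything detected so far. */
	for (i = 0; i < n; i++)
		if ((detected_caps & (1UL << caps[i].id)) && caps[i].enable)
			caps[i].enable();
}
```

In this mock the callback runs on every CPU that passes through the check
once the capability is set, which is what makes it a poor fit for a
one-off, system-wide setting such as the GICv3 static key.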

> >At some point we may
> >implement support to defer the CPU on to user space (I already have a
> >patch that does this when no DT enable-method is specified, but I won't
> >publish it before Qualcomm fixes its firmware ;)). But we may have other
> >reasons to start with CPUs hot-unplugged by default and turn them on
> >later.
>
> We have SANITY check infrastructure that WARNs in such cases, if the features
> don't match. But still, wouldn't it be better to enable a feature
> only if all the boot-time enabled CPUs have it? (Errata are an exception though,
> since they only depend on whether one of the CPUs needs the workaround).

If we ever need this, I think we should implement a separate late_enable
function, as just deferring the enabling of all features is not generic enough.
But in the meantime, I don't think we should worry about this case,
let's wait and see whether we ever get such configurations (panicking
the kernel on incompatible features is a good starting point -
FPSIMD/no-FPSIMD, PAN/no-PAN etc.)
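
For reference, the rule discussed above (a feature only if every boot-time
CPU has it, an erratum workaround if any CPU needs it) can be written down
as a small user-space sketch. The names struct cpu_info, struct
system_caps and combine_boot_cpus() are hypothetical, and the bitmap
representation is an assumption made for the example, not how the kernel
stores this.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-CPU view; bit positions are made up. */
struct cpu_info {
	unsigned long features;	/* features this CPU implements */
	unsigned long errata;	/* workarounds this CPU needs */
};

/* Hypothetical system-wide result. */
struct system_caps {
	unsigned long features;	/* safe to enable everywhere */
	unsigned long errata;	/* workarounds that must be applied */
};

static struct system_caps combine_boot_cpus(const struct cpu_info *cpus,
					    size_t n)
{
	struct system_caps sys = { ~0UL, 0UL };
	size_t i;

	assert(cpus != NULL && n > 0);

	for (i = 0; i < n; i++) {
		sys.features &= cpus[i].features;	/* all CPUs must have it */
		sys.errata |= cpus[i].errata;		/* one CPU is enough */
	}
	return sys;
}
```

The asymmetry is the point: features are intersected, errata are unioned,
so a single affected CPU is enough to turn a workaround on while a single
CPU lacking a feature is enough to keep it off.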

--
Catalin