Re: [PATCH] arm64: Add override for MPAM
From: Xi Ruoyao
Date: Tue Apr 01 2025 - 08:35:08 EST
On Tue, 2025-04-01 at 13:09 +0100, Marc Zyngier wrote:
> On Tue, 01 Apr 2025 12:47:03 +0100,
> Xi Ruoyao <xry111@xxxxxxxxxxx> wrote:
> >
> > On Tue, 2025-04-01 at 14:04 +0530, Anshuman Khandual wrote:
> > > On 4/1/25 11:26, Xi Ruoyao wrote:
> > > > As the message of the commit 09e6b306f3ba ("arm64: cpufeature: discover
> > > > CPU support for MPAM") already states, if a buggy firmware fails to
> > > > either enable MPAM or emulate the trap as if it were disabled, the
> > > > kernel will just fail to boot. While upgrading the firmware should be
> > > > the best solution, we have some hardware whose vendor has not
> > > > responded 2 months after we requested a firmware update. Allow
> > > > overriding it so our devices don't become e-waste.
> > >
> > > There could be similar problems, where firmware might not enable arch
> > > features as required. Just wondering if there is a platform policy in
> > > place for enabling id-reg overrides for working around such scenarios
> > > > to prevent a kernel crash, etc.?
> >
> > In https://lore.kernel.org/all/87jzcfsuep.wl-maz@xxxxxxxxxx/:
> >
> > > For such cases, when MPAM is incorrectly advertised, can we have kernel
> > > command line parameter like mpam=0 to override its detection?
> >
> > We could, but only when we can confirm what the problem is.
> >
> > And there is prior art, such as:
> >
> > commit 892f7237b3ffb090f1b1f1e55fe7c50664405aed
> > Author: Marc Zyngier <maz@xxxxxxxxxx>
> > Date: Wed Jul 20 11:52:19 2022 +0100
> >
> > arm64: Delay initialisation of cpuinfo_arm64::reg_{zcr,smcr}
> >
> > Even if we are now able to tell the kernel to avoid exposing SVE/SME
> > from the command line, we still have a couple of places where we
> > unconditionally access the ZCR_EL1 (resp. SMCR_EL1) registers.
> >
> > On systems with broken firmwares, this results in a crash even if
> > arm64.nosve (resp. arm64.nosme) was passed on the command-line.
> >
> > To avoid this, only update cpuinfo_arm64::reg_{zcr,smcr} once
> > we have computed the sanitised version for the corresponding
> > feature registers (ID_AA64PFR0 for SVE, and ID_AA64PFR1 for
> > SME). This results in some minor refactoring.
>
> That particular patch has caused quite a few issues, see d3c7c48d004f.
> So don't use it as a reference.
>
> Now, while I think an option is probably acceptable in the face of an
> unresponsive vendor, I don't think the way you implement it is the
> correct approach.
>
> It should be possible to handle the override in the assembly code,
> like we do for other bits and pieces, and deal with MPAMIDR_EL1 later
> down the line, once the sanitised ID registers are known to be valid.
OK, I'll try it.
--
Xi Ruoyao <xry111@xxxxxxxxxxx>
School of Aerospace Science and Technology, Xidian University
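
The id-reg override idea discussed in this thread can be illustrated with a
small user-space model. This is only a sketch under stated assumptions, not
the kernel's implementation and not the patch under review: the struct is
loosely modelled on the kernel's val/mask override pair, all function names
are hypothetical, and the MPAM field is assumed to sit at bits [43:40] of
ID_AA64PFR0_EL1. The early assembly path and the deferred MPAMIDR_EL1 access
mentioned above are not modelled.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a per-register override: "mask" marks the
 * bits being overridden, "val" holds the values forced into those bits.
 */
struct ftr_override {
	uint64_t val;
	uint64_t mask;
};

/* Assumed field position: ID_AA64PFR0_EL1.MPAM, bits [43:40]. */
#define MPAM_SHIFT 40
#define MPAM_WIDTH 4

/* Build a mask covering "width" bits starting at bit "shift". */
static uint64_t field_mask(unsigned int shift, unsigned int width)
{
	assert(width > 0 && width < 64 && shift + width <= 64);
	return ((UINT64_C(1) << width) - 1) << shift;
}

/* Describe an override that forces one field to "value". */
static struct ftr_override override_field(unsigned int shift,
					  unsigned int width, uint64_t value)
{
	uint64_t m = field_mask(shift, width);
	struct ftr_override o = { .val = (value << shift) & m, .mask = m };

	return o;
}

/* Sanitised value: overridden bits come from o.val, the rest from reg. */
static uint64_t apply_override(uint64_t reg, struct ftr_override o)
{
	return (reg & ~o.mask) | (o.val & o.mask);
}

/* Extract the (assumed) MPAM field from an ID_AA64PFR0_EL1 value. */
static unsigned int mpam_field(uint64_t pfr0)
{
	return (unsigned int)((pfr0 >> MPAM_SHIFT) &
			      ((UINT64_C(1) << MPAM_WIDTH) - 1));
}
```

With an override built by override_field(MPAM_SHIFT, MPAM_WIDTH, 0), a raw
register value that advertises MPAM is sanitised to one that does not, while
every other field is left untouched. Code that consults only the sanitised
value would then never reach the MPAM system registers.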