Re: [PATCH v2] arm64/fpsimd: Only provide the length to cpufeature for xCR registers

From: Mark Brown
Date: Fri Aug 04 2023 - 12:37:45 EST


On Fri, Aug 04, 2023 at 05:20:21PM +0100, Catalin Marinas wrote:
> On Thu, Aug 03, 2023 at 06:44:24PM +0100, Mark Brown wrote:
> > On Thu, Aug 03, 2023 at 05:39:38PM +0100, Catalin Marinas wrote:

> > > Maybe that's the simplest fix, especially if you want it in stable, but

> > Yeah, it's definitely the sort of change we want as a fix - anything
> > more invasive would be inappropriate.

> I'd say it's still ok if we can just rip some code out safely (the fake
> ID reg).

It's the safely bit that concerns me here - it feels like a great way to
discover why the code was there, possibly including a use that was there
in the past but has subsequently been removed, and so bites a stable
version.

> > Both enumeration mechanisms were added in the initial series supporting
> > SVE for reasons that are not entirely obvious to me. The changelogs
> > explain what we're doing with the pseudo ID register stuff but do not
> > comment on why. There is a cross check between the answers the two give
> > which appears to be geared towards detecting systems with asymmetric
> > maximum VLs for some reason but I'm not sure why that's done given that
> > we can't cope if *any* VL in the committed set is missing, not just the
> > maximum.

> We can cope with different VLs if the committed map is built during boot
> (early secondary CPU bring-up). For any late/hotplugged CPUs, if they
> don't fit the map, they'll be rejected. Not sure where the actual
> maximum length matters in this process though (or later for user space).
> I assume the user will only be allowed to set the common VLs across all
> the early CPUs.

Indeed, since we need to check each VL in the set we expose to userspace
individually, that will include the maximum VL, which should mean that
having a separate check for the maximum VL is redundant. That's always
been the case though, which makes me worried about changing it for a fix
rather than a cleanup.
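
To make the redundancy argument concrete, here is a rough userspace
model. It is a sketch only, not the kernel's code: the type vq_set_t and
every function name below are hypothetical, and the set is squeezed into
a 64-bit word rather than a full bitmap. If a late CPU passes the
per-VL check then it necessarily passes a maximum-VL check too, while
the reverse does not hold.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model: bit (vq - 1) set means a vector length of
 * vq * 128 bits is supported. Only 64 VQs are modelled for brevity.
 */
typedef uint64_t vq_set_t;

/* Single-element set containing the given VQ. */
static vq_set_t vq_bit(unsigned int vq)
{
	assert(vq >= 1 && vq <= 64);
	return (vq_set_t)1 << (vq - 1);
}

/* Early CPUs: the committed set is the intersection of all of them. */
static vq_set_t commit_early_cpu(vq_set_t committed, vq_set_t cpu)
{
	return committed & cpu;
}

/* Largest VQ in a set, or 0 if the set is empty. */
static unsigned int max_vq(vq_set_t set)
{
	unsigned int vq = 0;

	while (set) {
		vq++;
		set >>= 1;
	}
	return vq;
}

/* Late CPU: accepted only if every committed VL is supported. */
static bool late_cpu_ok(vq_set_t committed, vq_set_t cpu)
{
	return (committed & ~cpu) == 0;
}

/* A separate maximum-only check, shown purely for comparison. */
static bool late_cpu_max_ok(vq_set_t committed, vq_set_t cpu)
{
	return max_vq(cpu) >= max_vq(committed);
}
```

With a committed set of VQs {1, 2}, a late CPU offering {1, 4} passes
the maximum-only check but fails the per-VL check, so the maximum check
adds nothing once the per-VL check has been done.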

For KVM we need the stricter requirement that no additional VLs are
supported in the subset that KVM exports to clients since guests can
directly enumerate VLs from the hardware and we don't want the answer
changing depending on what physical CPU we schedule a vCPU on. That
should similarly not need a distinct check for the maximum VL.
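
The stricter KVM condition can be sketched the same way. Again this is
a hypothetical userspace model rather than the kernel's implementation:
vq_set_t, vq_mask_upto(), late_cpu_ok_for_guests() and the
guest_max_vq parameter are all names made up for illustration. On top of
the host requirement that every committed VL is present, a late CPU must
not offer any extra VL at or below the largest VL that guests can see.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model: bit (vq - 1) set means a vector length of
 * vq * 128 bits is supported. Only 64 VQs are modelled for brevity.
 */
typedef uint64_t vq_set_t;

/* Single-element set containing the given VQ. */
static vq_set_t vq_bit(unsigned int vq)
{
	assert(vq >= 1 && vq <= 64);
	return (vq_set_t)1 << (vq - 1);
}

/* Set of all VQs from 1 up to and including max_vq. */
static vq_set_t vq_mask_upto(unsigned int max_vq)
{
	assert(max_vq <= 64);
	if (max_vq == 0)
		return 0;
	if (max_vq == 64)
		return ~(vq_set_t)0;
	return ((vq_set_t)1 << max_vq) - 1;
}

/*
 * Guest-facing check: within the range visible to guests, the late CPU
 * must not support any VL that is missing from the committed set, since
 * a guest enumerating VLs directly would otherwise see different answers
 * on different physical CPUs.
 */
static bool late_cpu_ok_for_guests(vq_set_t committed, vq_set_t cpu,
				   unsigned int guest_max_vq)
{
	vq_set_t extra = cpu & ~committed & vq_mask_upto(guest_max_vq);

	return extra == 0;
}
```

For a committed set of VQs {1, 2, 4} with guests limited to VQ 4, a late
CPU offering {1, 2, 3, 4} is rejected because VQ 3 would be visible to
guests, while one offering {1, 2, 4, 8} is fine because VQ 8 lies above
the guest-visible range. No comparison of maximum VLs is involved.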

> > The whole thing is very suspect but given that we don't currently have
> > any ability to emulate systems with asymmetric vector lengths I'm a bit
> > reluctant to poke at it.

> The Arm fast models should allow such configuration, though I haven't
> tried.

They don't. SVE and SME are provided as a plugin and all their
configuration is done at the plugin level, so there are no per-PE or
per-cluster options like there are for features implemented in the model
itself.
