Re: [BUG] arm64/m1: Accessing SYS_ID_AA64ISAR2_EL1 causes early boot failure on 5.15.28, 5.16.14, 5.17

From: Marc Zyngier
Date: Mon Mar 14 2022 - 06:34:14 EST


[switching email address]

On 2022-03-14 10:03, A. Wilcox wrote:
On Mar 14, 2022, at 4:08 AM, Marc Zyngier <maz@xxxxxxxxxxxxxxx> wrote:
On 2022-03-14 06:35, Greg KH wrote:
On Sun, Mar 13, 2022 at 10:59:01PM -0500, A. Wilcox wrote:
Hello,
I’ve been testing kernel updates for the Adélie Linux distribution’s
ARM64 port using a Parallels VM on a MacBook Pro (13-inch, M1, 2020).
When the kernel attempts to access SYS_ID_AA64ISAR2_EL1, it causes a
fault as seen here booting 5.17.0-rc8:

[…]

This is because detection of the clearbhb instruction support requires
accessing SYS_ID_AA64ISAR2_EL1. Commenting out the two uses of
supports_clearbhb in the kernel now yields a successful boot.
QEMU developers seem to have found this issue as well[1] when trying to
boot 5.17 using HVF, the Apple Hypervisor Framework. This seems to be
some sort of platform quirk on M1, or at least in HVF on M1. I’m not
sure what the best workaround would be for this. SYS_ID_AA64ISAR2_EL1
seems to be something added in ARMv8.7, so perhaps access to it could be
gated on that.
Unfortunately, this code was just added to 5.15.28 and 5.16.14, so
stable no longer boots on Parallels VM on M1. I am unsure if this
affects physical boot on Apple M1 or not.
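
For readers following along, the clearbhb detection described above amounts to reading one 4-bit field out of ID_AA64ISAR2_EL1. Below is a minimal userspace sketch of that decode, not the kernel's actual code. The helper names (id_field, isar2_has_clearbhb) are made up for illustration, and the register value is passed in as a plain integer, since the MRS read itself is exactly what faults on the affected hypervisor. The field position (bits [31:28]) is taken from the Arm architecture's definition of the CLRBHB field.

```c
#include <stdbool.h>
#include <stdint.h>

/* CLRBHB field of ID_AA64ISAR2_EL1 occupies bits [31:28]. */
#define ISAR2_CLRBHB_SHIFT 28

/* Hypothetical helper: extract an unsigned 4-bit ID field at `shift`. */
static inline unsigned int id_field(uint64_t reg, unsigned int shift)
{
	return (unsigned int)((reg >> shift) & 0xf);
}

/*
 * Hypothetical stand-in for the kernel's check. `isar2` is the raw value of
 * ID_AA64ISAR2_EL1, supplied by the caller so this runs in userspace; in the
 * kernel the value would come from a system register read.
 */
static inline bool isar2_has_clearbhb(uint64_t isar2)
{
	/* Any non-zero field value means the instruction is implemented. */
	return id_field(isar2, ISAR2_CLRBHB_SHIFT) != 0;
}
```

On a CPU without the instruction the field reads as zero, so a correctly emulated register simply yields "not supported" rather than a fault.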
What commit causes this problem? It sounds like you narrowed this down
already, right?

This really is a Parallels bug. These kernels run fine on bare metal
M1 and in KVM. QEMU was affected as well, and that was fixed in their
HVF handling. HVF itself is fine.

So this should be punted back to the hypervisor vendor for not properly
implementing the architecture (no ID register is allowed to UNDEF).
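
To illustrate the point about ID registers not being allowed to UNDEF, here is a rough sketch of how a hypervisor's trap handler could treat guest reads in the ID register space: return the value it knows about, and read-as-zero for anything it does not, instead of injecting an undefined-instruction exception. This is not Parallels', QEMU's, or KVM's code. All type and function names are hypothetical, and the bounds used for the ID space (op0=3, op1=0, CRn=0, CRm=1..7) are an assumption of this sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Decoded system register encoding from a trapped MRS (hypothetical type). */
struct sysreg_enc {
	uint8_t op0, op1, crn, crm, op2;
};

/* Hypothetical table entry: an ID register the hypervisor has a value for. */
struct id_reg_val {
	struct sysreg_enc enc;
	uint64_t val;
};

/* ID space as assumed in this sketch: op0=3, op1=0, CRn=0, CRm=1..7. */
static inline bool in_id_space(struct sysreg_enc e)
{
	return e.op0 == 3 && e.op1 == 0 && e.crn == 0 &&
	       e.crm >= 1 && e.crm <= 7;
}

static inline bool same_enc(struct sysreg_enc a, struct sysreg_enc b)
{
	return a.op0 == b.op0 && a.op1 == b.op1 && a.crn == b.crn &&
	       a.crm == b.crm && a.op2 == b.op2;
}

/*
 * Returns true if the read was handled and stores the result in *out.
 * Returns false only for encodings outside the ID space, which the caller
 * must handle some other way. Unknown ID registers read as zero.
 */
static inline bool emulate_id_read(struct sysreg_enc e,
				   const struct id_reg_val *known, size_t n,
				   uint64_t *out)
{
	if (!in_id_space(e))
		return false;

	for (size_t i = 0; i < n; i++) {
		if (same_enc(known[i].enc, e)) {
			*out = known[i].val;
			return true;
		}
	}

	*out = 0;	/* read-as-zero, never UNDEF */
	return true;
}
```

With a handler shaped like this, a guest read of ID_AA64ISAR2_EL1 (op0=3, op1=0, CRn=0, CRm=6, op2=2) on a hypervisor that predates the register returns zero, and the guest concludes that none of the features it describes are present.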

Thanks, I wasn’t able to test native boot. Since this is a bug in the
hypervisor, I’ll notify them in the morning.

Great, thanks.

For those of us stuck with Parallels, I’ll assume reverting these
three commits in my own build is the best way forward until it’s
fixed. The M1 isn’t going to grow new instruction support in the
meantime, so I don’t see a whole lot of harm in it - but the other
mitigations in .28 seem useful.

As a *very* short term solution, that's probably the right thing to do.

However, this register is bound to grow new uses over time, and disabling
these features in a distro kernel is going to impact all users, unless
your particular kernel build is strictly limited to M1.

Thanks,

M.
--
Jazz is not dead. It just smells funny...