Re: [RESEND RFC PATCH v1 1/5] arm64: Add TLB Conflict Abort Exception handler to KVM

From: Ryan Roberts
Date: Thu Dec 12 2024 - 04:25:15 EST


On 11/12/2024 17:40, Marc Zyngier wrote:
> On Wed, 11 Dec 2024 16:01:37 +0000,
> Mikołaj Lenczewski <miko.lenczewski@xxxxxxx> wrote:
>>
>> Currently, KVM does not handle the case of a stage 2 TLB conflict abort
>> exception. The Arm ARM specifies that the worst-case handling of such an
>> exception requires a `tlbi vmalls12e1`.
>
> Not quite. It says (I_JCCRT):
>
> <quote>
> * For the EL1&0 translation regime, when stage 2 translations are in
> use, either VMALLS12E1 or ALLE1.
> </quote>
>
>> Perform such an invalidation when this exception is encountered.
>
> What you fail to describe is *why* this is needed. You know it, I know
> it, but not everybody does. A reference to the ARM ARM would
> definitely be helpful.
>
>>
>> Signed-off-by: Mikołaj Lenczewski <miko.lenczewski@xxxxxxx>
>> ---
>> arch/arm64/include/asm/esr.h | 8 ++++++++
>> arch/arm64/kvm/mmu.c | 6 ++++++
>> 2 files changed, 14 insertions(+)
>>
>> diff --git a/arch/arm64/include/asm/esr.h b/arch/arm64/include/asm/esr.h
>> index d1b1a33f9a8b..8a66f81ca291 100644
>> --- a/arch/arm64/include/asm/esr.h
>> +++ b/arch/arm64/include/asm/esr.h
>> @@ -121,6 +121,7 @@
>> #define ESR_ELx_FSC_SEA_TTW(n) (0x14 + (n))
>> #define ESR_ELx_FSC_SECC (0x18)
>> #define ESR_ELx_FSC_SECC_TTW(n) (0x1c + (n))
>> +#define ESR_ELx_FSC_TLBABT (0x30)
>>
>> /* Status codes for individual page table levels */
>> #define ESR_ELx_FSC_ACCESS_L(n) (ESR_ELx_FSC_ACCESS + (n))
>> @@ -464,6 +465,13 @@ static inline bool esr_fsc_is_access_flag_fault(unsigned long esr)
>> (esr == ESR_ELx_FSC_ACCESS_L(0));
>> }
>>
>> +static inline bool esr_fsc_is_tlb_conflict_abort(unsigned long esr)
>> +{
>> + esr = esr & ESR_ELx_FSC;
>> +
>> + return esr == ESR_ELx_FSC_TLBABT;
>> +}
>> +
>> /* Indicate whether ESR.EC==0x1A is for an ERETAx instruction */
>> static inline bool esr_iss_is_eretax(unsigned long esr)
>> {
>> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
>> index c9d46ad57e52..c8c6f5a97a1b 100644
>> --- a/arch/arm64/kvm/mmu.c
>> +++ b/arch/arm64/kvm/mmu.c
>> @@ -1756,6 +1756,12 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
>> ipa = fault_ipa = kvm_vcpu_get_fault_ipa(vcpu);
>> is_iabt = kvm_vcpu_trap_is_iabt(vcpu);
>>
>> + if (esr_fsc_is_tlb_conflict_abort(esr)) {
>> + // does a `tlbi vmalls12e1is`
>
> nit: this isn't a very useful comment.
>
>> + __kvm_tlb_flush_vmid(&vcpu->kvm->arch.mmu);
>> + return 1;
>> + }
>
> That's not enough, unfortunately. A nested VM has *many* VMIDs (the
> flattening of all translation contexts that the guest uses).
>
> So you can either iterate over all the valid VMIDs owned by this
> guest, or more simply issue a TLBI ALLE1, which will do the trick in a
> much more efficient way.
>
> The other thing is that you are using an IS invalidation, which is
> farther reaching than necessary. Why would you invalidate the TLBs for
> CPUs that are only innocent bystanders? A non-shareable invalidation
> seems preferable to me.
>
>> +
>> if (esr_fsc_is_translation_fault(esr)) {
>> /* Beyond sanitised PARange (which is the IPA limit) */
>> if (fault_ipa >= BIT_ULL(get_kvm_ipa_limit())) {
>
> But it also begs the question: why only KVM, and not the host? This
> handler will only take effect for a TLB Conflict abort delivered from
> an EL1 guest to EL2.

Hi Marc,

I believe the intent of this patch is to protect the host/KVM against a guest
that is using BBML2. The host/KVM always assumes BBML0 and therefore doesn't do
any operations that are allowed by the arch to cause a conflict abort. Therefore
the host doesn't need to handle it. But a guest could be taking advantage of
BBML2 and therefore it's architiecturally possible for a conflict abort to be
raised to EL2. I think today that would take down the host?

So really I think this could be considered a stand-alone KVM hardening improvement?

>
> However, it doesn't seem to me that the host is equipped to deal with
> this sort of exception for itself. Shouldn't you start with that?

If the host isn't doing any BBML2 operations it doesn't need to handle it, I
don't think? Obviously that changes later in the series and Miko is adding the
required handling to the host.

Thanks,
Ryan

>
> Thanks,
>
> M.
>