Re: [PATCH v6 1/1] x86: kvm: svm: set up ERAPS support for guests

From: Andrew Cooper
Date: Mon Nov 24 2025 - 11:43:18 EST

In reply to: Shah, Amit: "Re: [PATCH v6 1/1] x86: kvm: svm: set up ERAPS support for guests"
Next in thread: Shah, Amit: "Re: [PATCH v6 1/1] x86: kvm: svm: set up ERAPS support for guests"

On 24/11/2025 4:15 pm, Shah, Amit wrote:
> On Thu, 2025-11-20 at 12:11 -0800, Sean Christopherson wrote:
>>> 2. Hosts that disable NPT: the ERAPS feature flushes the RSB entries on
>>>    several conditions, including CR3 updates. Emulating hardware
>>>    behaviour on RSB flushes is not worth the effort for NPT=off case,
>>>    nor is it worthwhile to enumerate and emulate every trigger the
>>>    hardware uses to flush RSB entries. Instead of identifying and
>>>    replicating RSB flushes that hardware would have performed had NPT
>>>    been ON, do not let NPT=off VMs use the ERAPS features.
>> The emulation requirements are not limited to shadow paging. From
>> the APM:
>>
>> The ERAPS feature eliminates the need to execute CALL instructions to clear
>> the return address predictor in most cases. On processors that support ERAPS,
>> return addresses from CALL instructions executed in host mode are not used in
>> guest mode, and vice versa. Additionally, the return address predictor is
>> cleared in all cases when the TLB is implicitly invalidated (see Section 5.5.3 “TLB
>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>> Management,” on page 159) and in the following cases:
>>
>> • MOV CR3 instruction
>> • INVPCID other than single address invalidation (operation type 0)
>>
>> Yes, KVM only intercepts MOV CR3 and INVPCID when NPT is disabled (or INVPCID is
>> unsupported per guest CPUID), but that is an implementation detail, the instructions
>> are still reachable via emulator, and KVM needs to emulate implicit TLB flush
>> behavior.
>>
>> So punting on emulating RAP clearing because it's too hard is not an option. And
>> AFAICT, it's not even that hard.
> I didn't mean punting on it in the "it's too hard" sense, but in the
> sense that we don't know all the details of when hardware decides to do
> a flush; and even if triggers are mentioned in this APM today, future
> changes to microcode or APM docs might reveal more triggers that we
> need to emulate and account for. There's no way to track such changes,
> so my thinking is that we should be conservative and not assume
> anything.

But this *is* the problem. The APM says that OSes can depend on this
property for safety, and does not provide enough information for
Hypervisors to make it safe.

ERAPS is a bad spec. It should not have gotten out of the door.

A better spec would say "clears the RAP on any MOV to CR3" and nothing else.

The fact that it might happen microarchitecturally in other cases
doesn't matter; what matters is what OSes can architecturally depend on,
and right now that explicitly includes "unspecified cases in NDA
documents".
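
For illustration only, here is a minimal sketch in C of the check a
hypervisor would need if it relied solely on the triggers named in the APM
text quoted above. Every name in it (rap_event, rap_clear_required, the EV_*
constants) is hypothetical and is not taken from KVM or from the patch.

```c
#include <stdbool.h>

/* Hypothetical event kinds; illustrative names only, not KVM's. */
enum rap_event {
    EV_MOV_CR3,             /* guest MOV to CR3 */
    EV_INVPCID,             /* guest INVPCID, any operation type */
    EV_IMPLICIT_TLB_FLUSH,  /* anything the APM lists as an implicit TLB invalidation */
    EV_OTHER,               /* everything else */
};

/* Per the quoted APM text, INVPCID operation type 0 is single-address invalidation. */
#define INVPCID_TYPE_SINGLE_ADDR 0u

/*
 * Returns true when the quoted APM text says the return address predictor
 * is cleared. invpcid_type is only consulted for EV_INVPCID.
 */
static bool rap_clear_required(enum rap_event ev, unsigned int invpcid_type)
{
    switch (ev) {
    case EV_MOV_CR3:
        return true;
    case EV_INVPCID:
        /* Every operation type except single-address invalidation. */
        return invpcid_type != INVPCID_TYPE_SINGLE_ADDR;
    case EV_IMPLICIT_TLB_FLUSH:
        return true;
    default:
        return false;
    }
}
```

The two instruction cases can be decided from the instruction alone. The
EV_IMPLICIT_TLB_FLUSH case is only as complete as the documented list of
implicit invalidations that feeds it, and that is the part a hypervisor
cannot pin down.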

~Andrew
