Re: [PATCH 1/2] x86: cpu/bugs: add support for AMD ERAPS feature

From: Shah, Amit
Date: Mon Nov 04 2024 - 12:22:57 EST


On Mon, 2024-11-04 at 08:26 -0800, Dave Hansen wrote:
> On 11/4/24 08:13, Shah, Amit wrote:
> > I want to justify that not setting X86_FEATURE_RSB_CTXSW is still
> > doing the right thing, albeit in hardware.
>
> Let's back up a bit.
>
> In the kernel, we have security concerns if RSB contents remain across
> context switches.  If process A's RSB entries are left and then process
> B uses them, there's a problem.
>
> Today, we mitigate that issue with manual kernel RSB state zapping on
> context switches (X86_FEATURE_RSB_CTXSW).
>
> You're saying that this fancy new ERAPS feature includes a new mechanism
> to zap RSB state.  But that only triggers "each time a TLB flush
> happens".
>
> So what you're saying above is that you are concerned about RSB contents
> sticking around across context switches.  But instead of using
> X86_FEATURE_RSB_CTXSW, you believe that the new TLB-flush-triggered
> ERAPS flush can be used instead.
>
> Are we all on the same page so far?

All good so far.

> I think you're wrong.  We can't depend on ERAPS for this.  Linux doesn't
> flush the TLB on context switches when PCIDs are in play.  Thus, ERAPS
> won't flush the RSB and will leave bad state in there and will leave the
> system vulnerable.
>
> Or what am I missing?

I just received confirmation from our hardware engineers on this too:

1. The RSB is flushed when CR3 is updated.
2. The RSB is flushed when INVPCID is issued (except type 0 - single
   address).

I hadn't mentioned item 1 so far, which I think is what led to your
question.  Does this now cover all the cases?
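
To make the two rules concrete, here is a rough sketch of them as a
predicate. It is illustration only: the enum and function names are made
up for this mail and are not kernel or hardware interfaces. It models only
what is stated above, and treats every CR3 write the same way, since the
confirmation I got did not distinguish between kinds of CR3 update.

```c
#include <stdbool.h>

/* Hypothetical event model for this mail, not a kernel API. */
enum rsb_event_kind {
	EV_CR3_WRITE,	/* CR3 is updated */
	EV_INVPCID,	/* INVPCID is issued with some type */
};

/* INVPCID types 0-3 as defined by the x86 ISA; names are illustrative. */
enum invpcid_type {
	INVPCID_INDIV_ADDR	= 0,	/* single address */
	INVPCID_SINGLE_CTXT	= 1,
	INVPCID_ALL_INCL_GLOBAL	= 2,
	INVPCID_ALL_NON_GLOBAL	= 3,
};

/*
 * Returns true if, per the two rules above, the hardware flushes the RSB
 * for this event.  'type' is ignored for EV_CR3_WRITE.
 */
static bool eraps_flushes_rsb(enum rsb_event_kind kind,
			      enum invpcid_type type)
{
	switch (kind) {
	case EV_CR3_WRITE:
		return true;				/* rule 1 */
	case EV_INVPCID:
		return type != INVPCID_INDIV_ADDR;	/* rule 2 */
	}
	return false;
}
```

So eraps_flushes_rsb(EV_INVPCID, INVPCID_INDIV_ADDR) is the only
combination in this model that leaves the RSB alone.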

Amit