Re: [PATCH 1/2] x86: cpu/bugs: add support for AMD ERAPS feature
From: Shah, Amit
Date: Tue Nov 05 2024 - 11:25:28 EST
On Tue, 2024-11-05 at 08:19 -0800, Dave Hansen wrote:
> On 11/5/24 02:39, Shah, Amit wrote:
> > On Mon, 2024-11-04 at 09:45 -0800, Dave Hansen wrote:
> > I'm expecting the APM update to come out soon, but I have put together
> >
> > https://amitshah.net/2024/11/eraps-reduces-software-tax-for-hardware-bugs/
> >
> > based on information I have. I think it's mostly consistent with what
> > I've said so far - with the exception of the mov-CR3 flush only
> > confirmed yesterday.
>
> That's better. But your original cover letter did say:
>
> Feature documented in AMD PPR 57238.
>
> which is technically true because the _bit_ is defined. But it's far,
> far from being sufficiently documented for Linux to actually use it.
Yea; apologies.
> Could we please be more careful about these in the future?
>
> > > So, I'll flip this back around. Today, X86_FEATURE_RSB_CTXSW zaps
> > > the RSB whenever RSP is updated to a new task stack. Please
> > > convince me that ERAPS provides superior coverage or is
> > > unnecessary in all the possible combinations switching between:
> > >
> > > different thread, same mm
> >
> > This case is the same userspace process with valid addresses in the
> > RSB for that process. An invalid speculation isn't security
> > sensitive, just a misprediction that won't be retired. So we are
> > good here.
>
> Does that match what the __switch_to_asm comment says, though?
>
> > /*
> > * When switching from a shallower to a deeper call stack
> > * the RSB may either underflow or use entries populated
> > * with userspace addresses. On CPUs where those concerns
> > * exist, overwrite the RSB with entries which capture
> > * speculative execution to prevent attack.
> > */
>
> It is also talking just about call depth, not about
> same-address-space RSB entries being harmless. That's because this is
> also trying to avoid having the kernel consume any user-placed RSB
> entries, regardless of whether they're from the same mm or not.
>
> > > user=>kernel, same mm
> > > kernel=>user, same mm
> >
> > user-kernel is protected with SMEP. Also, we don't call
> > FILL_RETURN_BUFFER for these switches?
>
> Amit, I'm beginning to fear that you haven't gone and looked at the
> relevant code here. Please go look at
> SYM_FUNC_START(__switch_to_asm) in arch/x86/entry/entry_64.S. I
> believe this code is called for all task switches, including
> switching from a user task to a kernel task. I also believe that
> FILL_RETURN_BUFFER is used unconditionally for every __switch_to_asm
> call (when X86_FEATURE_RSB_CTXSW is on of course).
>
> Could we please start over on this patch?
>
> Let's get the ERAPS+TLB-flush nonsense out of the kernel and get the
> commit message right.
>
> Then let's go from there.
Alright - you've been really patient, so thanks for that. I agree. I'll
post a v2 with updated commit messages, and then continue this
discussion on user/kernel task switch. And I'll also add an RFC tag to
it to ensure it doesn't get picked up.
Amit