Re: [PATCH v5 1/1] x86/cpufeatures: Implement Predictive Store Forwarding control.

From: Thomas Gleixner
Date: Fri May 07 2021 - 11:13:39 EST


On Wed, May 05 2021 at 14:09, Ramakrishna Saripalli wrote:
> From: Ramakrishna Saripalli <rk.saripalli@xxxxxxx>
>
> Certain AMD processors feature a new technology called Predictive Store
> Forwarding (PSF).
>
> PSF is a micro-architectural optimization designed to improve the
> performance of code execution by predicting dependencies between
> loads and stores.
>
> Incorrect PSF predictions can occur due to two reasons.
>
> - It is possible that the load/store pair may have had dependency for
> a while but the dependency has stopped because the address in the
> load/store pair has changed.
>
> - Second source of incorrect PSF prediction can occur because of an alias
> in the PSF predictor structure stored in the microarchitectural state.
> PSF predictor tracks load/store pair based on portions of instruction
> pointer. It is possible that a load/store pair which does have a
> dependency may be aliased by another load/store pair which does not have
> the same dependency. This can result in incorrect speculation.
>
> Software may be able to detect this aliasing and perform side-channel
> attacks.

So this is the new post spectre/meltdown/.../ world order.

What would have been considered a potential speculative side channel bug
two years ago is now sold a feature which is by default enabled.

Just to be clear. From a security POV this is just yet another
default enabled speculative vulnerability. The difference to the others
is that this is communicated upfront and comes with a knob to turn it
off right away.

There is also interaction with SSB and the SSB mitigation which is
described in the cover letter, but not in the changelog and is not
detectable from user space.

I know that you had it implemented that way in your first attempt, but I
was busy with other things and missed the discussion which resulted in
this being treated as a feature.

TBH, I'm not really happy about this because that's inconsistent with
how we treat the other speculation related issues and there is no way
for user space to actually check this like the other one via /sys/....

> +++ b/arch/x86/kernel/cpu/amd.c
> @@ -1170,3 +1170,22 @@ void set_dr_addr_mask(unsigned long mask, int dr)
> break;
> }
> }
> +
> +static int __init psf_cmdline(char *str)
> +{
> + if (!boot_cpu_has(X86_FEATURE_PSFD))
> + return 0;
> +
> + if (!str)
> + return -EINVAL;
> +
> + if (!strcmp(str, "off")) {
> + set_cpu_cap(&boot_cpu_data, X86_FEATURE_MSR_SPEC_CTRL);

What? Why is this setting this feature here and why is this not done in
init_speculation_control() as for all the other speculation misfeatures?

> + x86_spec_ctrl_base |= SPEC_CTRL_PSFD;

What? See below.

> + msr_set_bit(MSR_IA32_SPEC_CTRL, SPEC_CTRL_PSFD_SHIFT);
> + }
> +
> + return 0;

So any parameter is treated as valid here. That's interesting at best.

> --- a/arch/x86/kernel/cpu/bugs.c
> +++ b/arch/x86/kernel/cpu/bugs.c
> @@ -78,6 +78,8 @@ EXPORT_SYMBOL_GPL(mds_idle_clear);
>
> void __init check_bugs(void)
> {
> + u64 tmp = 0;
> +
> identify_boot_cpu();
>
> /*
> @@ -97,7 +99,9 @@ void __init check_bugs(void)
> * init code as it is not enumerated and depends on the family.
> */
> if (boot_cpu_has(X86_FEATURE_MSR_SPEC_CTRL))
> - rdmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base);
> + rdmsrl(MSR_IA32_SPEC_CTRL, tmp);
> +
> + x86_spec_ctrl_base |= tmp;

How is anyone supposed to understand that logic?

Just because x86_spec_ctrl_base is a global variable does not justify
this hackery at all. It's just a matter of time that someone reads this:

u64 x86_spec_ctrl_base;

void __init check_bugs(void)
{
u64 tmp = 0;

...

if (boot_cpu_has(X86_FEATURE_MSR_SPEC_CTRL))
rdmsrl(MSR_IA32_SPEC_CTRL, tmp);

x86_spec_ctrl_base |= tmp;

and figures that this is a pointless exercise and reverts that hunk.

What's wrong with just treating this in the same way in which we treat
all other speculative vulnerabilities and provide a consistent picture
to the user?

Something like the below. You get the idea.

Thanks,

tglx
---
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -111,6 +111,7 @@ void __init check_bugs(void)
mds_select_mitigation();
taa_select_mitigation();
srbds_select_mitigation();
+ psf_select_mitigation();

/*
* As MDS and TAA mitigations are inter-related, print MDS
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -891,6 +891,9 @@ static void init_speculation_control(str
set_cpu_cap(c, X86_FEATURE_MSR_SPEC_CTRL);
clear_cpu_cap(c, X86_FEATURE_VIRT_SSBD);
}
+
+ if (!boot_cpu_has(X86_FEATURE_PSFD))
+ set_cpu_cap(c, X86_FEATURE_MSR_SPEC_CTRL);
}

void get_cpu_cap(struct cpuinfo_x86 *c)