Re: [PATCH 1/1] x86/CPU/AMD: Avoid racy updates to MSR_K7_HWCR in set_cpuid_faulting()
From: Borislav Petkov
Date: Tue Jun 09 2026 - 19:30:10 EST
On Tue, Jun 09, 2026 at 03:08:22PM -0700, Jim Mattson wrote:
> Yes, I'd be happy to.
Thanks!
I haven't heard that in a while - it is usually grumbling :-P
> Most HWCR updates are in init code, so they should be fine. Should I
> call the helper from init code anyway?
Pfff, looking around the tree, we've accumulated a lot of HWCR pokers... I'm
thinking perhaps we should leave the obviously non-problematic ones alone for
now and convert only the problematic ones at first and then slowly, convert
the rest. Otherwise, that'll be a bunch of patch churn at once...
> If this can race with boost_set_msr_each(),
You mean if someone echoes into sysfs to toggle boost and yanks the module
at the same time? Pff, I fail to see the valid use case here but I won't be
surprised.
> then the bigger problem is that boost_set_msr_each() might set the
> boost-disable bit between the call to cpufreq_boost_down_prep() and the CPU
> going offline.
Offline? I think that's the module exit callback acpi_cpufreq_cpu_exit() which
calls it.
/me greps some more...
Uff, that's hit down the cpuhp_cpufreq_offline() path which is the hotplug
callback.
> I couldn't convince myself that cpufreq_boost_down_prep() was always
> called on the CPU going offline. The call to boost_set_msr() on the
> current CPU suggests it is, but the 'cpu' argument suggests it isn't.
> If policies are shared, my reading of __cpufreq_offline() is that
> cpufreq_driver->exit() [which calls cpufreq_boost_down_prep()] is only
> called when the last CPU sharing the policy goes offline, potentially
> leaving CPB_DIS set on the rest.
Yah, I see it the same. Unless we both are missing something, this looks
broken and prolly no one caught it because we're using amd-state now. And that
thing doesn't even touch CPB_DIS but does this pm_qos stuff in
amd_pstate_cpu_boost_update().
Dunno, do we care about a driver we don't use anymore... maybe, but low
prio...
> Anyway, since boost_set_msr() is called from both process context and
> interrupt context, perhaps I should use the helper in boost_set_msr()
> as well, just for consistency.
Yah, stick the call in the case ...HYGON/...AMD: branch and should be good.
I can test it then on some machines here.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette