Re: [PATCH v4 3/5] x86/umwait: Add sysfs interface to control umwait C0.2 state
From: Fenghua Yu
Date: Mon Jun 10 2019 - 00:06:44 EST
On Sat, Jun 08, 2019 at 03:50:32PM -0700, Andy Lutomirski wrote:
> On Fri, Jun 7, 2019 at 3:10 PM Fenghua Yu <fenghua.yu@xxxxxxxxx> wrote:
> > C0.2 state in umwait and tpause instructions can be enabled or disabled
> > on a processor through IA32_UMWAIT_CONTROL MSR register.
> > By default, C0.2 is enabled and the user wait instructions result in
> > lower power consumption with slower wakeup time.
> > But in real time systems which require faster wakeup time although power
> > savings could be smaller, the administrator needs to disable C0.2 and all
> > C0.2 requests from user applications revert to C0.1.
> > A sysfs interface "/sys/devices/system/cpu/umwait_control/enable_c02" is
> > created to allow the administrator to control C0.2 state during run time.
> This looks better than the previous version. I think the locking is
> still rather confused. You have a mutex that you hold while changing
> the value, which is entirely reasonable. But, of the code paths that
> write the MSR, only one takes the mutex.
> I think you should consider making a function that just does:
> wrmsr(MSR_IA32_UMWAIT_CONTROL, READ_ONCE(umwait_control_cached), 0);
> and using it in all the places that update the MSR. The only thing
> that should need the lock is the sysfs code to avoid accidentally
> corrupting the value, but that code should also use WRITE_ONCE to do
> its update.
Based on the comment, the illustrative CPU online and enable_c02 store
functions would be:
umwait_cpu_online():
	wrmsr(MSR_IA32_UMWAIT_CONTROL, READ_ONCE(umwait_control_cached), 0);

enable_c02_store():
	umwait_control_c02 = (u32)!c02_enabled;
	WRITE_ONCE(umwait_control_cached, 2 | get_umwait_control_max_time());
	on_each_cpu(umwait_control_msr_update, NULL, 1);
Then suppose umwait_control_cached = 100000 initially and only CPU0 is
running. The admin changes bit 0 in the MSR from 0 to 1 to disable C0.2
while CPU1 is being onlined at the same time:
1. On CPU1, read umwait_control_cached to eax as 100000 in
   umwait_cpu_online()
2. On CPU0, write 100001 to umwait_control_cached in enable_c02_store()
3. On CPU1, wrmsr with eax=100000 in umwait_cpu_online()
4. On CPU0, wrmsr with 100001 in enable_c02_store()
The result is that CPU0 and CPU1 have different MSR values.
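To make the interleaving concrete, here is a small user-space replay of the
four steps. This is only a sketch and not kernel code: fake_msr[],
fake_wrmsr() and replay_unserialized() are hypothetical stand-ins for the
per-CPU MSR and wrmsr(), and the steps are run sequentially in the order
listed above instead of on two real CPUs.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 2

/* Hypothetical stand-ins: one fake MSR per CPU plus the shared cached value. */
static uint32_t fake_msr[NR_CPUS];
static uint32_t umwait_control_cached;

/* Models wrmsr() executed on the given CPU. */
static void fake_wrmsr(int cpu, uint32_t val)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	fake_msr[cpu] = val;
}

/*
 * Replays steps 1-4 in the listed order.
 * Returns 1 if CPU0 and CPU1 end up with different MSR values.
 */
static int replay_unserialized(void)
{
	uint32_t eax;

	/* Initial state: only CPU0 is running, with the old value. */
	umwait_control_cached = 100000;
	fake_wrmsr(0, umwait_control_cached);

	eax = umwait_control_cached;          /* 1. CPU1 reads the cache    */
	umwait_control_cached = 100001;       /* 2. CPU0 updates the cache  */
	fake_wrmsr(1, eax);                   /* 3. CPU1 writes stale value */
	fake_wrmsr(0, umwait_control_cached); /* 4. CPU0 writes new value   */

	return fake_msr[0] != fake_msr[1];
}
```

In this model the cached value itself is never torn, yet CPU1 still ends up
holding the value it read before the update.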
The problem is that there is no wrmsr serialization between
umwait_cpu_online() and enable_c02_store(). WRITE_ONCE() and READ_ONCE()
only serialize access to umwait_control_cached. But we need to serialize
wrmsr() as well to guarantee that all CPUs have the same MSR value.
So does it make sense to keep the mutex and locking as the current patch does?
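As a rough illustration of what keeping the lock buys, below is a user-space
model in which both paths update the cache and write the MSR under the same
lock. It is a sketch under assumptions, not the patch itself: a pthread
mutex stands in for the kernel mutex, and fake_msr[], cpu_is_online[],
model_cpu_online(), model_enable_c02_store() and msrs_consistent() are
hypothetical names. The loop over online CPUs stands in for on_each_cpu().

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define NR_CPUS 2

/* Hypothetical stand-ins for the kernel lock, per-CPU MSRs and online mask. */
static pthread_mutex_t umwait_lock = PTHREAD_MUTEX_INITIALIZER;
static uint32_t fake_msr[NR_CPUS] = { 100000, 0 };
static int cpu_is_online[NR_CPUS] = { 1, 0 };
static uint32_t umwait_control_cached = 100000;

/* Online path: the cache read and the MSR write happen under the lock. */
static void model_cpu_online(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	pthread_mutex_lock(&umwait_lock);
	fake_msr[cpu] = umwait_control_cached;
	cpu_is_online[cpu] = 1;
	pthread_mutex_unlock(&umwait_lock);
}

/* Store path: the cache update and all MSR writes happen under the same lock. */
static void model_enable_c02_store(uint32_t new_val)
{
	int cpu;

	pthread_mutex_lock(&umwait_lock);
	umwait_control_cached = new_val;
	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (cpu_is_online[cpu])
			fake_msr[cpu] = new_val;
	pthread_mutex_unlock(&umwait_lock);
}

/* Returns 1 if every online CPU holds the cached value. */
static int msrs_consistent(void)
{
	int cpu, ok = 1;

	pthread_mutex_lock(&umwait_lock);
	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (cpu_is_online[cpu] && fake_msr[cpu] != umwait_control_cached)
			ok = 0;
	pthread_mutex_unlock(&umwait_lock);
	return ok;
}
```

Whichever path takes the lock first, the other one sees either the fully old
or the fully new state, so the online CPUs cannot end up with different
values in this model.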