Re: [PATCH -next] power/qos: fix a data race in pm_qos_*_value
From: Qian Cai
Date:  Mon Feb 24 2020 - 14:02:46 EST
On Mon, 2020-02-24 at 10:54 +0100, Rafael J. Wysocki wrote:
> On Mon, Feb 24, 2020 at 2:01 AM Qian Cai <cai@xxxxxx> wrote:
> > 
> > 
> > 
> > > On Feb 23, 2020, at 7:12 PM, Rafael J. Wysocki <rafael@xxxxxxxxxx> wrote:
> > > 
> > > It may be a bug under certain conditions, but you don't mention what
> > > conditions they are.  Reporting it as a general bug is not accurate at
> > > the very least.
> > 
> > Could we rule out load tearing, store tearing and reload of global_req in cpuidle_governor_latency() for all compilers and architectures which could introduce logic bugs?
> > 
> >         int global_req = cpu_latency_qos_limit();
> > 
> >         if (device_req > global_req)
> >                 device_req = global_req;
> > 
> > If under register pressure, the compiler might get ride of the tmp variable, i.e.,
> > 
> > If (device_req > cpu_latency_qos_limit())
> > â-> race with the writer.
> >          device_req = cpu_latency_qos_limit();
> 
> Yes, there is a race here with or without the WRITE_ONCE()/READ_ONCE()
> annotations (note that these annotations don't prevent CPUs from
> reordering things, so device_req may be set before global_req
> regardless).
> 
> However, worst-case it may cause an old value to be used and that can
> happen anyway if the entire cpuidle_governor_latency_req() runs
> between the curr_value update and pm_qos_set_value() in
> pm_qos_update_target(), for example.
> 
> IOW, there is no guarantee that the new value will be used immediately
> after updating a QoS request anyway.
> 
> I agree with adding the annotations (I was considering posting a patch
> doing that myself), but just as a matter of making the intention
> clear.
OK, how about this updated texts?
[PATCH -next] power/qos: annotate a data race in pm_qos_*_value
cpu_latency_constraints.target_value could be accessed concurrently via,
cpu_latency_qos_apply
 pm_qos_update_target
   pm_qos_set_value
cpuidle_governor_latency_req
  cpu_latency_qos_limit
    pm_qos_read_value
The read is outside pm_qos_lock critical section which results in a data race.
However, the worst case is that an old value to be used and that can happen
anyway, so annotate this data race using a pair of READ|WRITE_ONCE().