Re: [PATCH v2] cpufreq: Avoid a couple of races related to cpufreq_cpu_get()

From: Viresh Kumar
Date: Thu Nov 17 2016 - 12:57:18 EST


On 17-11-16, 14:35, Rafael J. Wysocki wrote:
> That unless cpu == policy->cpu and it is going offline I suppose?
>
> The scenario is as follows. cpufreq_get() is invoked for policy->cpu
> and cpufreq_offline() runs for it at the same time.
>
> cpufreq_get() calls cpufreq_cpu_get() which does the policy->cpus
> check which passes, because cpufreq_offline() hasn't updated the mask
> yet. Now cpufreq_offline() updates the mask and proceeds with
> cpufreq_driver->stop_cpu() and cpufreq_driver->exit(). Then, it drops
> the lock.
>
> cpufreq_get() acquires the lock. The policy is still there, but it
> may be inactive at this point. Still, cpufreq_get() doesn't check
> that, but invokes __cpufreq_get() unconditionally, which calls
> cpufreq_driver->get(policy->cpu). Is this still guaranteed to work?
> I don't think so.
>
> It looks like a policy_is_inactive() check should be there in
> cpufreq_get() at least.

Okay, trying to do any operations on the device for an inactive policy is
absolutely wrong. I agree.
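
To make the interleaving above concrete, here is a minimal user-space sketch
in C. It is only a model under stated assumptions, not the kernel code: the
struct and every model_* name are hypothetical stand-ins, the mask is a plain
bitmask rather than a cpumask, and the lock is left out because the steps are
replayed one after another in the racy order. It only illustrates why
re-checking for an inactive policy after taking the lock matters.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct cpufreq_policy (model only). */
struct model_policy {
	unsigned long cpus;	/* bitmask of online CPUs using this policy */
	unsigned int cpu;	/* CPU that manages the policy */
	bool driver_ready;	/* false once the modelled ->exit() has run */
};

/* Same idea as policy_is_inactive(): no online CPUs are left in the mask. */
static bool model_policy_is_inactive(const struct model_policy *p)
{
	return p->cpus == 0;
}

/* Lookup-time mask test, done before the lock is taken. */
static bool model_cpu_in_policy(const struct model_policy *p, unsigned int cpu)
{
	return (p->cpus >> cpu) & 1UL;
}

/* Offline path: clear the CPU, tear the driver down if it was the last one. */
static void model_offline(struct model_policy *p, unsigned int cpu)
{
	p->cpus &= ~(1UL << cpu);
	if (model_policy_is_inactive(p))
		p->driver_ready = false;
}

/* Stand-in for ->get(): returns -1 to flag a call made after teardown. */
static int model_driver_get(const struct model_policy *p)
{
	return p->driver_ready ? 1200000 : -1;
}

/*
 * Reader side after it finally gets the lock. With recheck == true it
 * bails out with 0 for an inactive policy instead of touching the driver.
 */
static int model_get_locked(const struct model_policy *p, bool recheck)
{
	if (recheck && model_policy_is_inactive(p))
		return 0;
	return model_driver_get(p);
}
```

Replaying the race with a single-CPU policy: model_cpu_in_policy() passes
first, then model_offline() runs, and only then does the reader call
model_get_locked(). Without the re-check the model reaches the torn-down
driver; with it, the reader returns 0 and never touches the driver.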

> >> +
> >> up_read(&policy->rwsem);
> >>
> >> cpufreq_cpu_put(policy);
> >> @@ -2142,6 +2154,11 @@ int cpufreq_get_policy(struct cpufreq_po
> >> if (!cpu_policy)
> >> return -EINVAL;
> >>
> >> + if (!cpumask_test_cpu(cpu, policy->cpus)) {
> >> + cpufreq_cpu_put(cpu_policy);
> >> + return -EINVAL;
> >> + }
> >> +
> >
> > We are just copying the policy here, so it should be always safe.
>
> So the check is not necessary at all?

Right.

> Say the CPU is the only one in the policy and it is going offline.
>
> cpufreq_update_policy() is invoked at the same time and calls
> cpufreq_cpu_get() which checks policy->cpus and the test passes,
> because cpufreq_offline() hasn't updated the mask yet. Then
> cpufreq_offline() updates the mask and the policy becomes inactive,
> but there are no checks for that going forward, unless I'm overlooking
> something again.

Same here. I agree.

> > Also, even if we have some real cases for cpufreq_cpu_get_raw(), which
> > needs to get fixed, I believe that we can move the check to
> > cpufreq_cpu_get() and not to every caller.
>
> I disagree, but for now I'm going to leave cpufreq_cpu_get() alone.
> To me, the policy->cpus check in cpufreq_cpu_get_raw() is just
> confusing (it isn't even needed by some callers of that function),
> which is the reason why I'd prefer to get rid of it.

Okay.

> I'll add policy_is_inactive() checks to cpufreq_get() and
> cpufreq_update_policy() at this point.

That would be much better, I think.

--
viresh