Re: [PATCH 3/4] PM / OPP: take RCU lock in dev_pm_opp_get_opp_count

From: Dmitry Torokhov
Date: Wed Dec 24 2014 - 12:09:19 EST

On Wed, Dec 24, 2014 at 8:48 AM, Nishanth Menon <nm@xxxxxx> wrote:
> On 12/16/2014 05:09 PM, Dmitry Torokhov wrote:
>> A lot of callers are missing the fact that dev_pm_opp_get_opp_count
>> needs to be called under RCU lock. Given that RCU locks can safely be
>> nested, instead of providing *_locked() API, let's take RCU lock inside
>> dev_pm_opp_get_opp_count() and leave callers as is.
> While it is true that we can safely do nested RCU locks, This also
> encourages wrong usage.
> count = dev_pm_opp_get_opp_count(dev)
> ^^ point A
> array = kzalloc(count * sizeof (*array));
> rcu_read_lock();
> ^^ point B
> .. work down the list and add OPPs..
> ...
> Between A and B, we might have had list modification (dynamic OPP
> addition or deletion) - which implies that the count is no longer
> accurate between point A and B. instead, enforcing callers to have the
> responsibility of rcu_lock is exactly what we have to do since the OPP
> library has no clue how to enforce pointer or data accuracy.

No, you seem to have a misconception that rcu_lock protects you past
the point B, but that is also wrong. The only thing rcu "lock"
provides is safe traversing the list and guarantee that elements will
not disappear while you are referencing them, but list can both
contract and expand under you. In that regard code in
drivers/cpufreq/cpufreq_opp.c is utterly wrong. If you want to count
the list and use number of elements you should be taking a mutex.
Luckily all cpufreq drivers at the moment only want to see if OPP
table is empty or not, so as a stop-gap we can take rcu_lock
automatically as we are getting count. We won't get necessarily
accurate result, but at least we will be safe traversing the list.


To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at
Please read the FAQ at