Re: [PATCH] cpufreq, store_scaling_governor requires policy->rwsem to be held for duration of changing governors [v2]

From: Saravana Kannan
Date: Tue Aug 05 2014 - 18:51:26 EST


On 08/05/2014 03:42 PM, Prarit Bhargava wrote:
So back to the original problem, which in short, revolves around these two
functions (with comments included by me):

/* called with kernfs rwsem for this kobj held */
static ssize_t show(struct kobject *kobj, struct attribute *attr, char *buf)
{
struct cpufreq_policy *policy = to_policy(kobj);
struct freq_attr *fattr = to_attr(attr);
ssize_t ret;

if (!down_read_trylock(&cpufreq_rwsem))
return -EINVAL;

down_read(&policy->rwsem);


and

static ssize_t store(struct kobject *kobj, struct attribute *attr,
const char *buf, size_t count)
{
struct cpufreq_policy *policy = to_policy(kobj);
struct freq_attr *fattr = to_attr(attr);
ssize_t ret = -EINVAL;

get_online_cpus();

if (!cpu_online(policy->cpu))
goto unlock;

if (!down_read_trylock(&cpufreq_rwsem))
goto unlock;

down_write(&policy->rwsem);

/* if governor switch, calls sysfs_remove_group
* and acquires kernfs rwsem for this kobj */

There's another bug here which I haven't had a chance to discuss. There's the
possibility that we have multiple threads waiting at the store's
down_write(&policy->rwsem) when another thread does a governor switch. When
the policy->rwsem is released by the governor switch thread, all the other
threads will enter this code path and race through while looking at stale data.

We hit some NULL pointers (very similar to the ones originally reported in this
thread) and, of course, the system dies.

I wonder if the show() down_read(&policy->rwsem) needs to be a
down_read_trylock(), and similarily in the store the down_write(&policy->rwsem)
needs to be a down_write_trylock() ?

This will create bigger issues if you make this change to the generic show/store. The writes would no longer be reliable even if the race could have been handled properly in the kernel (say, a race between a call to cpufreq_update_policy() and user space reading/writing something). This would be a serious userspace ABI change.

We would also have to do a check on policy->governor_enabled to verify that
the data was still valid after taking the lock.

*If* we were to make this change, does that also happen to fix the risk of a
deadlock in this code?

We should not do the change you are referring to.


That might be too hacky ... gotta be a better way :/ ...

Anyway, just a thought.


I definitely have a fix for this and the original race you reported. It's basically reverting that commit you reverted + a fix for the deadlock. That's the only way to fix the scaling_governor issue.

You fix the deadlock by moving the governor attribute group removing to the framework code and doing it before STOP+EXIT to governor without holding the policy lock. And the reverse for INIT+STOP.

I'll try to get to it if no one else does.

-Saravana

--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/