Re: [PATCH v2] cpufreq: Don't destroy/realloc policy/sysfs on hotplug/suspend

From: Saravana Kannan
Date: Mon Jul 14 2014 - 15:08:39 EST

Next message: Tim Chen: "Re: [PATCH v4 6/7] sched: add function nr_running_cpu to expose number of tasks running on cpu"
Previous message: Murphy, Dan: "Re: [PATCH v6] ASoC: tas2552: Support TI TAS2552 Amplifier"
In reply to: Viresh Kumar: "Re: [PATCH v2] cpufreq: Don't destroy/realloc policy/sysfs on hotplug/suspend"
Next in thread: Viresh Kumar: "Re: [PATCH v2] cpufreq: Don't destroy/realloc policy/sysfs on hotplug/suspend"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On 07/13/2014 11:09 PM, Viresh Kumar wrote:

On 12 July 2014 08:14, Saravana Kannan <skannan@xxxxxxxxxxxxxx> wrote:

I'm just always adding the real nodes to the first CPU in a cluster
independent of which CPU gets added first. Makes it easier to know which
ones to symlink. See comment next to policy->cpu for full context.

Yeah, and that is the order in which CPUs will boot and cpufreq_add_dev()
will be called. So, isn't policy->cpu the right CPU always?

No, the "first" cpu in a cluster doesn't need to be the first one to be
added. An example is 2x2 cluster system where the system is booted with max
cpus = 2 and then cpu3 could be onlined first by userspace.

Because we are getting rid of much of the complexity now, I do not want
policy->cpu to keep changing. Just fix it up to the cpu for which the policy
gets created first. That's it. No more changes required. It doesn't matter at
userspace which cpu owns it as symlinks would anyway duplicate it under
every cpu.

I think you missed one my of comments in the email. I agree with what you are saying here. I'll just do it as a separate patch to keep this one simpler. I don't want to touch all the governors and other potential uses of policy->cpu in this patch.

Yeah, it is pretty convolution. But pretty much anywhere in the gov code
where policy->cpu is used could cause this. The specific crash I hit was in
this code:

static void od_dbs_timer(struct work_struct *work)
{
struct od_cpu_dbs_info_s *dbs_info =
container_of(work, struct od_cpu_dbs_info_s,
cdbs.work.work);
unsigned int cpu = dbs_info->cdbs.cur_policy->cpu;

======= CPU is policy->cpu here.

struct od_cpu_dbs_info_s *core_dbs_info = &per_cpu(od_cpu_dbs_info,
cpu);

======= Picks the per CPU struct of an offline CPU

<snip>

mutex_lock(&core_dbs_info->cdbs.timer_mutex);

======= Dies trying to lock a destroyed mutex

I am still not getting it. Why would we get into this if policy->cpu is fixed
once at boot ?

Yeah, it definitely crashes if policy->cpu if an offline cpu. Because the mutex would be uninitialized if it's stopped after boot or it would never have been initialized (depending on how you fix policy->cpu at boot).

Look at this snippet on the actual tree and it should be pretty evident.

-Saravana

--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Next message: Tim Chen: "Re: [PATCH v4 6/7] sched: add function nr_running_cpu to expose number of tasks running on cpu"
Previous message: Murphy, Dan: "Re: [PATCH v6] ASoC: tas2552: Support TI TAS2552 Amplifier"
In reply to: Viresh Kumar: "Re: [PATCH v2] cpufreq: Don't destroy/realloc policy/sysfs on hotplug/suspend"
Next in thread: Viresh Kumar: "Re: [PATCH v2] cpufreq: Don't destroy/realloc policy/sysfs on hotplug/suspend"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]