Re: [PATCH v1 3/9] powerpc/powernv: Add cpu hotplug support

From: Daniel Axtens
Date: Tue Jun 02 2015 - 19:39:24 EST


On Tue, 2015-06-02 at 21:29 +0530, Madhavan Srinivasan wrote:
> Patch adds cpu hotplug support. First online cpu in a node is picked as
> designated thread to read the Nest pmu counter data, and at the time of
> hotplug, next online cpu from the same node is picked up.

I'm not sure I understand this commit message. I think I understand the
first half - I think you're trying to say: "At boot, the first online
CPU in a node is picked as the designated thread to read the Nest PMU
counter data." I'm not sure I understand the second half: "picked up"
how and for what?

(I did eventually figure it out by reading the patch, but it'd be really
nice to have it spelled out in the commit message.)

> +static void nest_exit_cpu(int cpu)
> +{
> + int i, nid, target = -1;
> + const struct cpumask *l_cpumask;
> + int src_chipid;
> +
> + if (!cpumask_test_and_clear_cpu(cpu, &cpu_mask_nest_pmu))
> + return;
> +
> + nid = cpu_to_node(cpu);
> + src_chipid = topology_physical_package_id(cpu);
> + l_cpumask = cpumask_of_node(nid);
> + for_each_cpu(i, l_cpumask) {
> + if (i == cpu)
> + continue;
> + if (src_chipid == topology_physical_package_id(i)) {
> + target = i;
> + break;
> + }
> + }

Some comments here would really help. I think you're looking for the
first CPU that's (a) not the cpu you're removing and (b) on the same
physical package, so sharing the same nest, but it took me a lot of
staring at the code to figure it out.
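
To show the sort of comment I mean, here is an untested userspace sketch of
the selection step. The chip_of[] array standing in for
topology_physical_package_id(), and the name nest_pick_target(), are made up
for illustration and are not in the patch. It also assumes every CPU in the
array is a valid candidate.

```c
#include <stddef.h>

/*
 * Userspace model only: chip_of[] stands in for
 * topology_physical_package_id(), and the array index is the CPU number.
 * Both are assumptions made for illustration, not the patch's code.
 */

/*
 * Pick a replacement for the CPU that is going away.
 *
 * Returns the first CPU that is
 *   (a) not the outgoing CPU, and
 *   (b) on the same chip, and so shares the same nest unit,
 * or -1 if no other CPU on that chip is available.
 */
static int nest_pick_target(const int *chip_of, size_t ncpus, int outgoing)
{
	int src_chip = chip_of[outgoing];
	size_t i;

	for (i = 0; i < ncpus; i++) {
		if ((int)i == outgoing)
			continue;
		if (chip_of[i] == src_chip)
			return (int)i;
	}

	return -1;
}
```

With chips laid out as {0, 0, 1, 1}, removing CPU 0 picks CPU 1 and removing
CPU 2 picks CPU 3.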

> +
> + cpumask_set_cpu(target, &cpu_mask_nest_pmu);
> + nest_change_cpu_context (cpu, target);
> + return;
Return is redundant here and in several other functions in this patch.
> +}
> +
> +static void nest_init_cpu(int cpu)
> +{
> + int i, src_chipid;
> +
> + src_chipid = topology_physical_package_id(cpu);
> + for_each_cpu(i, &cpu_mask_nest_pmu)
> + if (src_chipid == topology_physical_package_id(i))
> + return;
> +
> + cpumask_set_cpu(cpu, &cpu_mask_nest_pmu);
> + nest_change_cpu_context ( -1, cpu);
Weird extra spaces here.

> + return;
> +}
This function could also do with a comment: AFAICT, you've structured
the function so that it only calls nest_change_cpu_context if you've
picked up a cpu on a physical package that previously didn't have a nest
pmu thread on it.
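
If I've read it right, the logic would look roughly like the untested
userspace sketch below. The designated[] array standing in for
cpu_mask_nest_pmu, chip_of[] standing in for topology_physical_package_id(),
and the name nest_claim_if_first() are all my own inventions for
illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Userspace model only: chip_of[] stands in for
 * topology_physical_package_id() and designated[] for cpu_mask_nest_pmu.
 */

/*
 * Called for a CPU that is coming online.
 *
 * If some CPU on the same chip is already the designated nest PMU reader,
 * do nothing. Otherwise this is the first CPU up on its chip, so make it
 * the designated reader. Returns true if the incoming CPU was designated,
 * which is the point where the patch calls nest_change_cpu_context(-1, cpu).
 */
static bool nest_claim_if_first(const int *chip_of, bool *designated,
				size_t ncpus, int incoming)
{
	size_t i;

	for (i = 0; i < ncpus; i++)
		if (designated[i] && chip_of[i] == chip_of[incoming])
			return false;

	designated[incoming] = true;
	return true;
}
```

A two or three line comment along those lines above nest_init_cpu() would
have saved me the detective work.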

> +
> +static int nest_cpu_notifier(struct notifier_block *self,
> + unsigned long action, void *hcpu)
> +{
> + unsigned int cpu = (long)hcpu;
What's with this cast? You cast it to a long and then assign it to an
unsigned int?
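
If the cast is only there to get the CPU number back out of the void
pointer, going through an unsigned type would at least match the
destination. In the kernel I'd guess that means (unsigned long)hcpu; the
untested userspace sketch below uses uintptr_t instead, and both helper
names are hypothetical.

```c
#include <stdint.h>

/*
 * Userspace model: the CPU number travels through a void pointer.
 * Converting via an unsigned, pointer-width integer and then narrowing
 * explicitly keeps the signedness consistent with the destination type.
 */
static void *cpu_to_cookie(unsigned int cpu)
{
	return (void *)(uintptr_t)cpu;
}

static unsigned int cookie_to_cpu(void *hcpu)
{
	return (unsigned int)(uintptr_t)hcpu;
}
```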
> +
> + switch (action & ~CPU_TASKS_FROZEN) {
> + case CPU_DOWN_FAILED:
Is it necessary to move the thread back if the CPU fails to go down?
You've moved it to another online CPU already; what's the benefit of
paying the time penalty to move it back?
> + case CPU_STARTING:
> + nest_init_cpu(cpu);
> + break;
> + case CPU_DOWN_PREPARE:
> + nest_exit_cpu(cpu);
> + break;
> + default:
> + break;
> + }
> +
> + return NOTIFY_OK;
> +}

>

Now, I don't know the details of CPU hotplug _at all_, so this may be
stupid, but what happens if you hotplug a lot of CPUs all at once? Is
everything properly serialised, or is this going to race and end up with
either multiple CPUs trying to read the nest PMU for one chip, or none?
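
To make the question concrete, the invariant I'd expect to hold after any
sequence of hotplug events is "exactly one designated reader per chip that
has an online CPU, and none that are offline". An untested userspace sketch
of that check follows; the arrays and the function name are assumptions of
mine, not anything in the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Userspace model only: chip_of[] stands in for
 * topology_physical_package_id(), online[] for the online mask and
 * designated[] for cpu_mask_nest_pmu.
 *
 * Returns true if no offline CPU is designated and every chip with at
 * least one online CPU has exactly one designated reader.
 */
static bool nest_mask_is_consistent(const int *chip_of, const bool *online,
				    const bool *designated, size_t ncpus)
{
	size_t i, j;

	/* A designated reader must itself be online. */
	for (i = 0; i < ncpus; i++)
		if (designated[i] && !online[i])
			return false;

	/* Each online CPU must see exactly one reader on its own chip. */
	for (i = 0; i < ncpus; i++) {
		int count = 0;

		if (!online[i])
			continue;
		for (j = 0; j < ncpus; j++)
			if (designated[j] && chip_of[j] == chip_of[i])
				count++;
		if (count != 1)
			return false;
	}

	return true;
}
```

If the notifier calls can interleave, I'd worry about this returning false
in one of the two ways described above.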

Regards,
Daniel Axtens

