Re: [PATCH v1 3/9] powerpc/powernv: Add cpu hotplug support

From: Madhavan Srinivasan
Date: Thu Jun 04 2015 - 04:34:16 EST




On Wednesday 03 June 2015 05:08 AM, Daniel Axtens wrote:
> On Tue, 2015-06-02 at 21:29 +0530, Madhavan Srinivasan wrote:
>> Patch adds cpu hotplug support. First online cpu in a node is picked as
>> designated thread to read the Nest pmu counter data, and at the time of
>> hotplug, next online cpu from the same node is picked up.
> I'm not sure I understand this commit message. I think I understand the
> first half - I think you're trying to say: "At boot, the first online
> CPU in a node is picked as the designated thread to read the Nest PMU
> counter data." I'm not sure I understand the second half: "picked up"
> how and for what?

I will rephrase it. When the designated thread is hotplugged, the next
online cpu in the same node is picked as the designated thread to read
the Nest PMU counter data.

> (I did eventually figure it out by reading the patch, but it'd be really
> nice to have it spelled out nicely in the commit message.)

Sure. Will fix the commit message.

>> +static void nest_exit_cpu(int cpu)
>> +{
>> +	int i, nid, target = -1;
>> +	const struct cpumask *l_cpumask;
>> +	int src_chipid;
>> +
>> +	if (!cpumask_test_and_clear_cpu(cpu, &cpu_mask_nest_pmu))
>> +		return;
>> +
>> +	nid = cpu_to_node(cpu);
>> +	src_chipid = topology_physical_package_id(cpu);
>> +	l_cpumask = cpumask_of_node(nid);
>> +	for_each_cpu(i, l_cpumask) {
>> +		if (i == cpu)
>> +			continue;
>> +		if (src_chipid == topology_physical_package_id(i)) {
>> +			target = i;
>> +			break;
>> +		}
>> +	}
> Some comments here would really help. I think you're looking for the
> first CPU that's (a) not the cpu you're removing and (b) on the same
> physical package, so sharing the same nest, but it took me a lot of
> staring at the code to figure it out.
My bad. I will comment it.
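Something along these lines (untested sketch of the commented loop):

	/*
	 * The designated thread for this nest is going offline.
	 * Look for a new target: the first online cpu in the same
	 * node that (a) is not the cpu being removed and (b) is on
	 * the same physical package, so it shares the same nest
	 * counters.
	 */
	nid = cpu_to_node(cpu);
	src_chipid = topology_physical_package_id(cpu);
	l_cpumask = cpumask_of_node(nid);
	for_each_cpu(i, l_cpumask) {
		if (i == cpu)
			continue;
		if (src_chipid == topology_physical_package_id(i)) {
			target = i;
			break;
		}
	}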

>> +
>> +	cpumask_set_cpu(target, &cpu_mask_nest_pmu);
>> +	nest_change_cpu_context (cpu, target);
>> +	return;
> Return is redundant here and in several other functions in this patch.
Ok.

>> +}
>> +
>> +static void nest_init_cpu(int cpu)
>> +{
>> +	int i, src_chipid;
>> +
>> +	src_chipid = topology_physical_package_id(cpu);
>> +	for_each_cpu(i, &cpu_mask_nest_pmu)
>> +		if (src_chipid == topology_physical_package_id(i))
>> +			return;
>> +
>> +	cpumask_set_cpu(cpu, &cpu_mask_nest_pmu);
>> +	nest_change_cpu_context ( -1, cpu);
> Weird extra spaces here.
Yes. Nice catch. Will fix it.

>> +	return;
>> +}
> This function could also do with a comment: AFAICT, you've structured
> the function so that it only calls nest_change_cpu_context if you've
> picked up a cpu on a physical package that previously didn't have a nest
> pmu thread on it.
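Yes, that is the intent. Will add a comment here as well; roughly:

	/*
	 * A cpu is coming online. If any cpu already in
	 * cpu_mask_nest_pmu sits on the same physical package, that
	 * package already has a designated thread, so nothing to do.
	 */
	src_chipid = topology_physical_package_id(cpu);
	for_each_cpu(i, &cpu_mask_nest_pmu)
		if (src_chipid == topology_physical_package_id(i))
			return;

	/* First online cpu on this package: make it the designated thread. */
	cpumask_set_cpu(cpu, &cpu_mask_nest_pmu);
	nest_change_cpu_context(-1, cpu);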

>> +
>> +static int nest_cpu_notifier(struct notifier_block *self,
>> +			unsigned long action, void *hcpu)
>> +{
>> +	unsigned int cpu = (long)hcpu;
> What's with this cast? You cast it to a long and then assign it to an
> unsigned int?

Facepalm. My bad, will fix it.
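Will make the types consistent, i.e. something like:

	unsigned long cpu = (unsigned long)hcpu;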
>> +
>> +	switch (action & ~CPU_TASKS_FROZEN) {
>> +	case CPU_DOWN_FAILED:
> Is it necessary to move the thread back if the CPU fails to go down?
No, it is not needed.
> You've moved it to another online CPU already; what's the benefit of
> paying the time-penalty to move it back?
Right, no reason to go through that. Since there is no restriction that
only the first cpu has to read the counters, there is no point in
complicating this further; the thread can stay on the other cpu in the
same node.
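So the CPU_DOWN_FAILED case can simply go away; roughly:

	switch (action & ~CPU_TASKS_FROZEN) {
	case CPU_STARTING:
		nest_init_cpu(cpu);
		break;
	case CPU_DOWN_PREPARE:
		nest_exit_cpu(cpu);
		break;
	default:
		break;
	}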

>> +	case CPU_STARTING:
>> +		nest_init_cpu(cpu);
>> +		break;
>> +	case CPU_DOWN_PREPARE:
>> +		nest_exit_cpu(cpu);
>> +		break;
>> +	default:
>> +		break;
>> +	}
>> +
>> +	return NOTIFY_OK;
>> +}
> Now, I don't know the details of CPU hotplug _at all_, so this may be
> stupid, but what happens if you hotplug a lot of CPUs all at once? Is
> everything properly serialised or is this going to race and end up with
> either multiple cpus trying to do PMU or no cpus?
I did test the code with hotplug tests. The notifier callbacks are
serialised by the hotplug path itself. If all the cpus in a node are
offlined, then we will have no designated cpu for that node.
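The read side has to cope with that case; a hypothetical guard in the
counter-read path (assuming it knows the node id, nid, it is about to
read for) could look like:

	/* Whole node offlined: no designated cpu left, so skip the
	 * read instead of touching the nest counters. */
	if (!cpumask_intersects(cpumask_of_node(nid), &cpu_mask_nest_pmu))
		return;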

Thanks for the review.
Maddy

> Regards,
> Daniel Axtens


