On Tue, 2015-06-02 at 21:29 +0530, Madhavan Srinivasan wrote:I will rephrase it.
Patch adds cpu hotplug support. First online cpu in a node is picked asI'm not sure I understand this commit message. I think I understand the
designated thread to read the Nest pmu counter data, and at the time of
hotplug, next online cpu from the same node is picked up.
first half - I think you're trying to say: "At boot, the first online
CPU in a node is picked as the designated thread to read the Nest PMUWhen the designated thread is hotplugged, next online cpu in the
counter data." I'm not sure I understand the second half: "picked up"
how and for what?
(I did eventually figure it out by reading the patch, but it'd be reallySure. Will fix the commit message.
nice to have it spelled out nicely in the commit message.)
My bad. I will comment it.+static void nest_exit_cpu(int cpu)Some comments here would really help. I think you're looking for the
+{
+ int i, nid, target = -1;
+ const struct cpumask *l_cpumask;
+ int src_chipid;
+
+ if (!cpumask_test_and_clear_cpu(cpu, &cpu_mask_nest_pmu))
+ return;
+
+ nid = cpu_to_node(cpu);
+ src_chipid = topology_physical_package_id(cpu);
+ l_cpumask = cpumask_of_node(nid);
+ for_each_cpu(i, l_cpumask) {
+ if (i == cpu)
+ continue;
+ if (src_chipid == topology_physical_package_id(i)) {
+ target = i;
+ break;
+ }
+ }
first CPU that's (a) not the cpu you're removing and (b) on the same
physical package, so sharing the same nest, but it took me a lot of
staring at the code to figure it out.
Ok.+Return is redundant here and in several other functions in this patch.
+ cpumask_set_cpu(target, &cpu_mask_nest_pmu);
+ nest_change_cpu_context (cpu, target);
+ return;
Yes. Nice catch. Will fix it.+}Weird extra spaces here.
+
+static void nest_init_cpu(int cpu)
+{
+ int i, src_chipid;
+
+ src_chipid = topology_physical_package_id(cpu);
+ for_each_cpu(i, &cpu_mask_nest_pmu)
+ if (src_chipid == topology_physical_package_id(i))
+ return;
+
+ cpumask_set_cpu(cpu, &cpu_mask_nest_pmu);
+ nest_change_cpu_context ( -1, cpu);
+ return;This function could also do with a comment: AFAICT, you've structured
+}
the function so that it only calls nest_change_cpu_context if you've
picked up a cpu on a physical package that previously didn't have a nest
pmu thread on it.
+What's with this cast? You cast it to a long and then assign it to an
+static int nest_cpu_notifier(struct notifier_block *self,
+ unsigned long action, void *hcpu)
+{
+ unsigned int cpu = (long)hcpu;
unsigned int?
No. not need.+Is it necessary to move the thread back if the CPU fails to go down?
+ switch (action & ~CPU_TASKS_FROZEN) {
+ case CPU_DOWN_FAILED:
You've moved it to another online CPU already; what's the benefit ofWhy should go through that. Because, there is no restriction saying only the first
paying the time-penalty to move it back?
I did test the code with hotplug test. If all the cpus in the node is offlined,+ case CPU_STARTING:Now, I don't know the details of CPU hotplug _at all_, so this may be
+ nest_init_cpu(cpu);
+ break;
+ case CPU_DOWN_PREPARE:
+ nest_exit_cpu(cpu);
+ break;
+ default:
+ break;
+ }
+
+ return NOTIFY_OK;
+}
stupid, but what happens if you hotplug a lot of CPUs all at once? Is
everything properly serialised or is this going to race and end up with
either multiple cpus trying to do PMU or no cpus?
Regards,
Daniel Axtens