Re: CPU Hotplug: Hotplug Script And SIGPWR

From: Tim Hockin
Date: Wed Jan 21 2004 - 02:11:34 EST


On Wed, Jan 21, 2004 at 10:39:33AM +0530, Srivatsa Vaddagiri wrote:
> > *Before* that happens, tasks that don't handle the signal should just
> > have their affinity changed to all cpus.
>
> Currently, handle or not handle the signal, affinity is changed
> to all cpus for tasks that are bound only to the dying CPU.

OK, so lets assume this scenarion:

process A affined to cpu1
all other processes affined to 0xffffffff
cpu1 goes down
- process A affined to 0xffffffff
hotplug "cpu1 removed" event
cpu1 comes back
hotplug "cpu1 inserted" event

Process A has now discarded useful potentially VALUABLE information, with no
way to retrieve it. The hot plug scripts do not have enough information to
put things the way they were before. I can't believe that anyone considers
this to be OK.

Userspace gave us EXPLICIT instructions, which we then violate. By granting
affinity, we have made a contract with userspace. Changing affinity without
userspace's direct instruction is wrong.

What about this:

We already can not handle unexpected CPU removals gracefully, correct? So
we expect some user-provided notification, right?

So force userland to handle it before we give the OK to remove a CPU.

pid_t sys_proc_offline(int cpu)
{
pid_t p;

/* flag cpu as not schedulale anymore */
dont_add_tasks_to(cpu);

p = find_first_unrunnable(cpu);
if (p)
return p;

take_proc_offline(cpu);
return 0;
}

The userspace control can then loop on this until it returns 0. Each time
it return a pid, userspace must try to handle that pid - kill it,
re-affine it, or provide some way to suspend it.

Simpler yet:

int sys_proc_offline(int cpu, int reaffine)
{
pid_t p;

/* flag cpu as not schedulale anymore */
dont_add_tasks_to(cpu);

while ((p = find_first_unrunnable(cpu))) {
if (reaffine)
reaffine(p);
else
make_unrunnable(p);
}

take_proc_offline(cpu);
return 0;
}

Less flexible, but workable. I prefer the first. Yes it's racy, but the
worst case is that you receive a pid that you don't need to handle (died or
re-affined already).

Anything that violates affinity without permission just is so WRONG.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/