Re: [RFC][PATCH 06/22] sched: SCHED_DEADLINE handles spacialkthreads

From: Oleg Nesterov
Date: Sat Nov 13 2010 - 15:06:39 EST


On 11/13, Peter Zijlstra wrote:
>
> Something like so?.. hasn't even seen a compiler yet but one's got to do
> something to keep the worst bore of saturday night telly in check ;-)

Yes, I _think_ this all can work (and imho makes a lot of sense
if it works).

quick and dirty review below ;)

> struct take_cpu_down_param {
> - struct task_struct *caller;
> unsigned long mod;
> void *hcpu;
> };
> @@ -208,11 +207,6 @@ static int __ref take_cpu_down(void *_pa
>
> cpu_notify(CPU_DYING | param->mod, param->hcpu);
>
> - if (task_cpu(param->caller) == cpu)
> - move_task_off_dead_cpu(cpu, param->caller);
> - /* Force idle task to run as soon as we yield: it should
> - immediately notice cpu is offline and die quickly. */
> - sched_idle_next();

Yes. but we should remove "while (!idle_cpu(cpu))" from _cpu_down().

> @@ -2381,18 +2381,15 @@ static int select_fallback_rq(int cpu, s
> return dest_cpu;
>
> /* No more Mr. Nice Guy. */
> - if (unlikely(dest_cpu >= nr_cpu_ids)) {
> - dest_cpu = cpuset_cpus_allowed_fallback(p);
> - /*
> - * Don't tell them about moving exiting tasks or
> - * kernel threads (both mm NULL), since they never
> - * leave kernel.
> - */
> - if (p->mm && printk_ratelimit()) {
> - printk(KERN_INFO "process %d (%s) no "
> - "longer affine to cpu%d\n",
> - task_pid_nr(p), p->comm, cpu);
> - }
> + dest_cpu = cpuset_cpus_allowed_fallback(p);
> + /*
> + * Don't tell them about moving exiting tasks or
> + * kernel threads (both mm NULL), since they never
> + * leave kernel.
> + */
> + if (p->mm && printk_ratelimit()) {
> + printk(KERN_INFO "process %d (%s) no longer affine to cpu%d\n",
> + task_pid_nr(p), p->comm, cpu);
> }

Hmm. I was really puzzled until I realized this is just cleanup,
we can't reach this point if dest_cpu < nr_cpu_ids.

> +static void move_task_off_dead_cpu(int dead_cpu, struct task_struct *p)
> {
> struct rq *rq = cpu_rq(dead_cpu);
> + int needs_cpu, uninitialized_var(dest_cpu);
>
> - /* Must be exiting, otherwise would be on tasklist. */
> - BUG_ON(!p->exit_state);
> -
> - /* Cannot have done final schedule yet: would have vanished. */
> - BUG_ON(p->state == TASK_DEAD);
> -
> - get_task_struct(p);
> + needs_cpu = (task_cpu(p) == dead_cpu) && (p->state != TASK_WAKING);
> + if (needs_cpu)
> + dest_cpu = select_fallback_rq(dead_cpu, p);
> + raw_spin_unlock(&rq->lock);

Probably we do not need any checks. This task was picked by
->pick_next_task(), it should have task_cpu(p) == dead_cpu ?

But. I think there is a problem. We should not migrate current task,
stop thread, which does the migrating. At least, sched_stoptask.c
doesn't implement ->enqueue_task() and we can never wake it up later
for kthread_stop().

Perhaps migrate_tasks() should do for_each_class() by hand to
ignore stop_sched_class. But then _cpu_down() should somewhow
ensure the stop thread on the dead CPU is already parked in
schedule().

> - case CPU_DYING_FROZEN:
> /* Update our root-domain */
> raw_spin_lock_irqsave(&rq->lock, flags);
> if (rq->rd) {
> BUG_ON(!cpumask_test_cpu(cpu, rq->rd->span));
> set_rq_offline(rq);
> }
> + migrate_tasks(cpu);
> + BUG_ON(rq->nr_running != 0);
> raw_spin_unlock_irqrestore(&rq->lock, flags);

Probably we don't really need rq->lock. All cpus run stop threads.

I am not sure about rq->idle, perhaps it should be deactivated.
I don't think we should migrate it.


What I never understood is the meaning of play_dead/etc. If we
remove sched_idle_next(), who will do that logic? And how the
idle thread can call idle_task_exit() ?

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/