Re: [RFC][PATCH 07/22] sched: SCHED_DEADLINE push and pull logic

From: Raistlin
Date: Sun Nov 14 2010 - 04:15:33 EST


On Fri, 2010-11-12 at 17:17 +0100, Peter Zijlstra wrote:
> On Fri, 2010-10-29 at 08:32 +0200, Raistlin wrote:
> > Add dynamic migrations to SCHED_DEADLINE, so that tasks can
> > be moved among CPUs when necessary. It is also possible to bind a
> > task to a (set of) CPU(s), thus restricting its capability of
> > migrating, or forbidding migrations at all.
> >
> > The very same approach used in sched_rt is utilised:
> >  - -deadline tasks are kept into CPU-specific runqueues,
> >  - -deadline tasks are migrated among runqueues to achieve the
> >    following:
> >     * on an M-CPU system the M earliest deadline ready tasks
> >       are always running;
> >     * affinity/cpusets settings of all the -deadline tasks is
> >       always respected.
>
> I haven't fully digested the patch, I keep getting side-tracked and it's
> a large patch..
>
BTW, I was thinking about your suggestion of adding a *debugging* knob
for achieving a "lock everything while I'm migrating" behaviour... :-)

Something like locking the root_domain during pushes and pulls probably
won't work, since both of them do a double_lock_balance(), taking two
rq->locks, which might deadlock against this new "global" lock.
For example: we (CPU#1) hold rq1->lock, we take rd->lock, and then we
try to take rq2->lock. Meanwhile, CPU#2 holds rq2->lock and tries to
take rd->lock. Stuck! :-(
This can happen if both CPU#1 and CPU#2 are in the middle of a push or
a pull which, on each one, involves some task on the other. Do you
agree, or am I missing/mistaking something? :-)
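
To make the interleaving concrete, here is a toy userspace model of it.
It is only a sketch and not kernel code: every name in it (lock_model,
model_try_lock, replay_rd_lock_interleaving, ...) is made up for
illustration, and a "lock" is just a slot recording which CPU owns it.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, NOT kernel code: all names here are hypothetical. */
enum { RQ1_LOCK, RQ2_LOCK, RD_LOCK, NR_MODEL_LOCKS };
enum { CPU1, CPU2, NR_MODEL_CPUS };
#define NOBODY (-1)

struct lock_model {
	int owner[NR_MODEL_LOCKS];      /* CPU holding each lock, or NOBODY */
	int waiting_for[NR_MODEL_CPUS]; /* lock each CPU is blocked on, or NOBODY */
};

static void model_init(struct lock_model *m)
{
	for (int i = 0; i < NR_MODEL_LOCKS; i++)
		m->owner[i] = NOBODY;
	for (int i = 0; i < NR_MODEL_CPUS; i++)
		m->waiting_for[i] = NOBODY;
}

/* Take the lock if it is free; otherwise record that the CPU is blocked. */
static bool model_try_lock(struct lock_model *m, int cpu, int lock)
{
	assert(cpu >= 0 && cpu < NR_MODEL_CPUS);
	assert(lock >= 0 && lock < NR_MODEL_LOCKS);

	if (m->owner[lock] == NOBODY) {
		m->owner[lock] = cpu;
		m->waiting_for[cpu] = NOBODY;
		return true;
	}
	m->waiting_for[cpu] = lock;
	return false;
}

/* Each CPU is blocked on a lock that the other one holds. */
static bool model_deadlocked(const struct lock_model *m, int a, int b)
{
	int la = m->waiting_for[a];
	int lb = m->waiting_for[b];

	return la != NOBODY && lb != NOBODY &&
	       m->owner[la] == b && m->owner[lb] == a;
}

/* Replay the sequence described above and report whether it gets stuck. */
bool replay_rd_lock_interleaving(void)
{
	struct lock_model m;

	model_init(&m);
	model_try_lock(&m, CPU1, RQ1_LOCK); /* CPU#1 holds rq1->lock   */
	model_try_lock(&m, CPU2, RQ2_LOCK); /* CPU#2 holds rq2->lock   */
	model_try_lock(&m, CPU1, RD_LOCK);  /* CPU#1 takes rd->lock    */
	model_try_lock(&m, CPU2, RD_LOCK);  /* CPU#2 blocks on rd->lock */
	model_try_lock(&m, CPU1, RQ2_LOCK); /* CPU#1 blocks on rq2->lock */

	return model_deadlocked(&m, CPU1, CPU2);
}
```

In this model replay_rd_lock_interleaving() reports a deadlock, because
rd->lock is taken *after* an rq->lock on both CPUs, so no ordering
between the rq->locks alone can save us.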

Something we can probably do is lock the root_domain for
_each_and_every_ scheduling decision, having all the rq->locks nest
inside our new root_domain->lock. This would emulate some sort of single
global rq implementation, since even local decisions on one CPU would
affect all the others, as if they were sharing a single rq... It's
going to be very slow on large machines (but I guess we can afford
that... It's debugging!), and it will probably affect the other
scheduling classes too.
I'm not sure we want the latter... But maybe it could be useful for
debugging the others too (at least for FIFO/RR, it should be!).
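
The nesting I have in mind could look roughly like the userspace sketch
below. Again, this is only an illustration under my own assumptions:
pthread mutexes stand in for the kernel spinlocks, and rd_lock, rqs[],
debug_push() and rq_nr_running() are hypothetical names, not existing
scheduler functions.

```c
#include <assert.h>
#include <pthread.h>

/* Userspace stand-in for a runqueue; NOT the kernel's struct rq. */
struct rq {
	pthread_mutex_t lock;
	int nr_running;
};

/* Hypothetical debugging lock: the outermost lock for every decision. */
static pthread_mutex_t rd_lock = PTHREAD_MUTEX_INITIALIZER;

static struct rq rqs[2] = {
	{ PTHREAD_MUTEX_INITIALIZER, 2 }, /* "CPU 0" starts with two tasks */
	{ PTHREAD_MUTEX_INITIALIZER, 0 }, /* "CPU 1" starts empty          */
};

/*
 * Move one task from rqs[src] to rqs[dst], if there is one.
 * rd_lock is always taken first, so the rq locks nest inside it and
 * their relative order no longer matters for deadlock avoidance.
 * Returns 1 if a task was moved, 0 otherwise.
 */
int debug_push(int src, int dst)
{
	int moved = 0;

	assert(src != dst);

	pthread_mutex_lock(&rd_lock);
	pthread_mutex_lock(&rqs[src].lock);
	pthread_mutex_lock(&rqs[dst].lock);

	if (rqs[src].nr_running > 0) {
		rqs[src].nr_running--;
		rqs[dst].nr_running++;
		moved = 1;
	}

	pthread_mutex_unlock(&rqs[dst].lock);
	pthread_mutex_unlock(&rqs[src].lock);
	pthread_mutex_unlock(&rd_lock);

	return moved;
}

/* Even a purely local look at one rq goes through rd_lock first. */
int rq_nr_running(int cpu)
{
	int nr;

	pthread_mutex_lock(&rd_lock);
	pthread_mutex_lock(&rqs[cpu].lock);
	nr = rqs[cpu].nr_running;
	pthread_mutex_unlock(&rqs[cpu].lock);
	pthread_mutex_unlock(&rd_lock);

	return nr;
}
```

Two threads calling debug_push(0, 1) and debug_push(1, 0) at the same
time cannot get stuck here, since whoever loses the race for rd_lock
holds no rq lock yet. The price is exactly the one mentioned above:
everything is serialised on one lock.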

Let me know what you think...

Thanks and Regards,
Dario

--
<<This happens because I choose it to happen!>> (Raistlin Majere)
----------------------------------------------------------------------
Dario Faggioli, ReTiS Lab, Scuola Superiore Sant'Anna, Pisa (Italy)

http://blog.linux.it/raistlin / raistlin@xxxxxxxxx /
dario.faggioli@xxxxxxxxxx
