Re: RT task scheduling

From: Darren Hart
Date: Thu Apr 06 2006 - 13:24:51 EST


On Wednesday 05 April 2006 21:19, Peter Williams wrote:
> Darren Hart wrote:
> > My last mail specifically addresses preempt-rt, but I'd like to know
> > people's thoughts regarding this issue in the mainline kernel. Please
> > see my previous post "realtime-preempt scheduling - rt_overload behavior"
> > for a testcase that produces unpredictable scheduling results.
> >
> > Part of the issue here is to define what we consider "correct behavior"
> > for SCHED_FIFO realtime tasks. Do we (A) need to strive for "strict
> > realtime priority scheduling" where the NR_CPUS highest priority runnable
> > SCHED_FIFO tasks are _always_ running? Or do we (B) take the best effort
> > approach with an upper limit RT priority imbalances, where an imbalance
> > may occur (say at wakeup or exit) but will be remedied within 1 tick.
> > The smpnice patches improve load balancing, but don't provide (A).
> >
> > More details in the previous mail...
>
> I'm currently researching some ideas to improve smpnice that may help in
> this situation. The basic idea is that as well as trying to equally
> distribute the weighted load among the groups/queues we should also try
> to achieve equal "average load per task" for each group/queue. (As well
> as helping with problems such as yours, this will help to restore the
> "equal distribution of nr_running" amongst groups/queues aim that is
> implicit without smpnice due to the fact that load is just a smoothed
> version of nr_running.)

Can you elaborate on what you mean by "average load per task" ?

Also, since smpnice is (correct me if I am wrong) load_balancing, I don't
think it will prevent the problem from happening, but rather fix it when it
does. If we want to prevent it from happening, I think we need to do
something like the rt_overload code from the RT patchset.

>
> In find_busiest_group(), I think that load balancing in the case where
> *imbalance is greater than busiest_load_per_task will tend towards this
> result and also when *imbalance is less than busiest_load_per_task AND
> busiest_load_per_task is less than this_load_per_task. However, in the
> case where *imbalance is less than busiest_load_per_task AND
> busiest_load_per_task is greater than this_load_per_task this will not
> be the case as the amount of load moved from "busiest" to "this" will be
> less than or equal to busiest_load_per_task and this will actually
> increase the value of busiest_load_per_task. So, although it will
> achieve the aim of equally distributing the weighted load, it won't help
> the second aim of equal "average load per task" values for groups/queues.
>
> The obvious way to fix this problem is to alter the code so that more
> than busiest_load_per_task is moved from "busiest" to "this" in these
> cases while at the same time ensuring that the imbalance between their
> loads doesn't get any bigger. I'm working on a patch along these lines.
>
> Changes to find_idlest_group() and try_to_wake_up() taking into account
> the "average load per task" on the candidate queues/groups as well as
> their weighted loads may also help and I'll be looking at them as well.
> It's not immediately obvious to me how this can be done so any ideas
> would be welcome. It will likely involve taking the load weight of the
> waking task into account as well.
>
> Peter
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/