Re: [PATCH] Minor scheduler fix to get rid of skipping in xmms

From: Andrew Theurer
Date: Thu Sep 11 2003 - 08:03:44 EST


On Thursday 11 September 2003 06:04, Nick Piggin wrote:
> Andrew Theurer wrote:
> >Robert Love <rml@xxxxxxxxx> wrote:
> >>> There are a _lot_ of scheduler changes in 2.6-mm, and who knows which
> >>> ones are an improvement, a detriment, and a noop?
> >>
> >>We know that sched-2.6.0-test2-mm2-A3.patch caused the regression, and
> >>we know that sched-CAN_MIGRATE_TASK-fix.patch mostly fixed it up.
> >>
> >>
> >>What we don't know is whether the thing which
> >>sched-CAN_MIGRATE_TASK-fix.patch
> >>fixed was the thing which sched-2.6.0-test2-mm2-A3.patch broke.
> >
> >Sorry for jumping into this late. I didn't even know the can_migrate
> > patch was being discussed, let alone in -mm :). And to be fair, this
> > really is Ingo's aggressive idle steal patch.
> >
> >Anyway, these patches are somewhat related. It would seem that A3's
> >shortening the tasks' run time would not only slow performance because of
> >cache thrash, but could possibly break CAN_MIGRATE's cache warmth check,
> >right? That in turn would stop load balancing from working well, leading
> > to more idle time, which the CAN_MIGRATE patch sort of bypassed for idle
> > cpus.
>
> Yeah that's probably right. Good thinking.
>
> >I see Nick's balance patch as somewhat harmless, at least combined with A3
> >patch. However, one concern is that the "ping-pong" steal interval is not
> >really 200ms, but 200ms/(nr_cpus-1), which without A3, could show up as a
> >problem, especially on an 8 way box. In addition, I do think there's a
> >problem with num tasks we steal. It should not be imbalance/2, it should
> > be: max_load - (node_nr_running / num_cpus_node). If we steal any more
> > than this, which is quite possible with imbalance/2, then it's likely
> > this_cpu now has too many tasks, and some other cpu will steal again.
> > Using *imbalance/2 works fine on 2-way smp, but I'm pretty sure we "over
> > steal" tasks on 4 way and up. Anyway, I'm getting off topic here...
>
> IIRC max_load is supposed to be the number of tasks on the runqueue
> being stolen from, isn't it?

Yes, but I think I still got this wrong. Ideally, once we finish stealing,
the busiest runqueue should not have more than node_nr_running/nr_cpus_node,
but more importantly, this_cpu should not have more than
node_nr_running/nr_cpus_node, so maybe the number of tasks to steal should be:

	min(a, b), where

	load_average = node_nr_running / nr_cpus_node
	a = max_load - load_average	(how far the busiest queue is over load_average)
	b = load_average - this_load	(how far this_cpu is under load_average)

node_nr_running can be summed as we look for the busiest queue, so it should
not be too costly. If min(a, b) is negative (this_cpu's runqueue length was
greater than load_average), we don't steal at all.
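
To make the arithmetic concrete, here is a rough userspace sketch of that
calculation. It is not kernel code and not a patch: steal_count() and its
parameter list are made up for illustration, and only the variable names
(max_load, this_load, node_nr_running, nr_cpus_node) follow the description
above.

```c
/*
 * Hypothetical helper (illustration only, not from any kernel tree).
 * Returns how many tasks this_cpu could steal from the busiest runqueue
 * without leaving the busiest queue below, or this_cpu above, the
 * per-node average.
 */
int steal_count(int max_load, int this_load,
                int node_nr_running, int nr_cpus_node)
{
    int load_average, over, under, n;

    if (nr_cpus_node <= 0)
        return 0;                       /* nothing sensible to compute */

    /* Integer division: the average is rounded down. */
    load_average = node_nr_running / nr_cpus_node;

    over = max_load - load_average;     /* a: busiest queue's excess */
    under = load_average - this_load;   /* b: this_cpu's shortfall */

    n = over < under ? over : under;    /* min(a, b) */

    /* Negative (or zero) means there is nothing worth stealing. */
    return n > 0 ? n : 0;
}
```

For example, on a 4-way node with runqueue lengths 8, 0, 0, 0 the average is
2, so a = 6 and b = 2, and an idle cpu would take 2 tasks. If the imbalance
were taken as max_load - this_load = 8 (an assumption made only for this
example), halving it would move 4 tasks and leave this_cpu at twice the
average. Note that the rounded-down average makes the sketch return 0 for
lengths 3, 0, 0, 0, so a real version would probably need to round up or
scale the loads.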


