Re: [PATCH] Avoid moving tasks when a schedule can be made.
From: Ingo Molnar
Date: Wed Feb 01 2006 - 11:10:30 EST

* Nick Piggin <nickpiggin@xxxxxxxxxxxx> wrote:
> Ingo Molnar wrote:
> >* Nick Piggin <nickpiggin@xxxxxxxxxxxx> wrote:
>
> >>If it were generated by some real workload that cares, then I would care.
> >
> >
> >well, you might not care, but i do. It's up to you what you care about,
> >but right now the scheduler policy is that we do care about latencies.
> >Yes, it's obviously all subject to common sense, and if something
> >triggers in a rare and extreme workload then any change related to it
> >has a _much_ higher barrier of acceptance than a common codepath. But
> >your blanket dismissal of this whole subject based on the rarity of the
> >workload is just plain wrong.
> >
>
> No, if you read what I'd been saying, I'm not dismissing the whole
> subject based on the workload. I'm saying that there is no point to
> include such a "fix" based on the numbers given by this workload (if
> there is a more meaningful one, then sure). Especially not while there
> are sources of equivalent latency.

firstly, you are ignoring the fact that Steve never submitted this for
actual inclusion. His very first email stated:

  "I'm not convinced that this bail out is in the right location, but
   it worked where it is. Comments are welcome."

so i'm not sure why you are still pounding upon his patch, suggesting
that any solution to this problem is to be limited to the -rt kernel and
that the mainline kernel should not care. Yes, the mainline kernel does
care. We might not apply anything resulting from this (because this is
a tricky piece of code), but we do care very much. Indifference really
does not help.

secondly,

> It is really simple: I can find a code path in the kernel, and work
> out how to exploit it by increasing resource loading until it goes
> bang (another example, tasklist_lock).

we are busy fixing tasklist_lock latencies too. The point you are still
trying to make, that the scheduler should not be touched just because
there are other problem areas with unbounded latencies, is still plain
illogical.

> But there are still places where interrupts can be held off for
> indefinite periods. I don't see why the scheduler lock is suddenly
> important [...]

the scheduler lock is obviously important because it's the most central
lock in existence.

> [...] I could have told you years ago what would happen if you trigger
> the load balancer with enough tasks.

i very well know what move_tasks() can do. There used to be other ways
to provoke unbounded latencies in the scheduler - e.g. via pinned tasks,
for which we introduced the all_pinned hack. The all_pinned hack was
needed because the worst-case behavior was getting so bad on some larger
boxes under larger loads that it totally DoSed the system. So it's not
at all unprecedented for us to care about boundary behavior in the
scheduler, nor are we as shy about aborting load-balancing decisions as
you are suggesting. It all just depends on the circumstances and on the
quality of the patch.
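
(for illustration only, not kernel code: a toy model of the two kinds of
bail-out mentioned above, i.e. flagging the case where every examined
task is pinned, and capping how much work a single balancing pass does.
The names toy_task, toy_move_tasks, max_moves and max_scan are made up
for this sketch and are assumptions; they do not reflect the actual
move_tasks() implementation or Steve's patch.)

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a runnable task; NOT the kernel's task_struct. */
struct toy_task {
	bool pinned;	/* task may not run on the destination CPU */
	bool moved;	/* set once this sketch has "migrated" the task */
};

/*
 * Toy balancing pass over a source queue (hypothetical helper).
 *
 * Migrates at most max_moves tasks and examines at most max_scan tasks,
 * so the work done per pass is bounded no matter how long the queue is.
 * *all_pinned is set to true only if at least one task was examined and
 * every examined task was pinned, so a caller can stop retrying a queue
 * it cannot pull anything from.
 *
 * Returns the number of tasks moved.
 */
static size_t toy_move_tasks(struct toy_task *queue, size_t nr_tasks,
			     size_t max_moves, size_t max_scan,
			     bool *all_pinned)
{
	size_t moved = 0;
	size_t scanned = 0;
	bool saw_movable = false;

	assert(all_pinned != NULL);

	for (size_t i = 0; i < nr_tasks; i++) {
		/* Bail out once either budget is used up. */
		if (moved >= max_moves || scanned >= max_scan)
			break;

		scanned++;

		/* Pinned tasks cost scan budget but cannot be moved. */
		if (queue[i].pinned)
			continue;

		saw_movable = true;
		queue[i].moved = true;
		moved++;
	}

	*all_pinned = (scanned > 0 && !saw_movable);
	return moved;
}
```

in this sketch a caller would check *all_pinned after a pass that moved
nothing and back off instead of rescanning the same queue, and would
pick max_scan small enough that one pass stays short even with a very
large number of tasks. Where exactly such a cut-off belongs in the real
code is the open question of this thread.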
Ingo