Re: [RFC -v2 PATCH 2/3] sched: add yield_to function

From: Mike Galbraith
Date: Mon Dec 20 2010 - 05:33:25 EST


On Mon, 2010-12-20 at 11:46 +0200, Avi Kivity wrote:
> On 12/20/2010 11:30 AM, Mike Galbraith wrote:
> > > >
> > > > Because preempting a perfect stranger is not courteous, all tasks have
> > > > to play nice.
> > >
> > > I don't want to preempt anybody, simply make the task run before me.
> >
> > I thought you wanted to get the target to the cpu asap? You just can't
> > have he runs before me cross cpu.
>
> You're right, of course. I'm fine with running in parallel. I'm fine
> with him running before or instead of me. I'm not fine with running
> while the other guy is waiting.

Goody, maybe we're headed down a productive path then.

> > > Further, this is a kernel internal API, so no need for these types of
> > > restrictions. If we expose it to userspace, sure.
> >
> > Doesn't matter whether it's kernel or not afaict. If virtualization has
> > to coexist peacefully with other loads, it can't just say "my hints are
> > the only ones that count", and thus shred other loads' throughput.
>
> What does that have to do with being in the same group or not? I want
> to maintain fairness (needed for pure virt workloads, one guest cannot
> dominate another), but I don't see how being in the same thread group is
> relevant.

My thought is that you can shred your own throughput, but not some other
concurrent load. I'll have to let that thought stew a bit though.

> Again, I don't want more than one entitlement. I want to move part of
> my entitlement to another task.

Folks can keep trying that, but IMO it's too broken to live.

> > > > > > use cfs_rq->next to pass the scheduler a HINT of what you would LIKE to
> > > > > > happen.
> > > > >
> > > > > Hint is fine, so long as the scheduler seriously considers it.
> > > >
> > > > It will take the hint if the target hasn't had too much cpu.
> > >
> > > Since I'm running and the target isn't, it's clear the scheduler thinks
> > > the target had more cpu than I did. That's why I want to donate
> > > cpu time.
> >
> > That's not necessarily true, in fact, it's very often false. Last/next
> > buddy will allow a task to run ahead of leftmost so we don't always
> > blindly select leftmost and shred cache.
>
> Ok.
>
> > > >
> > > > What would you suggest? There is no global execution timeline, so if
> > > > you want to definitely run after this task, you're stuck with moving to
> > > > his timezone or moving him to yours. Well, you could sleep a while, but
> > > > we know how productive sleeping is.
> > >
> > > I don't know. The whole idea of donating runtime was predicated on CFS
> > > being completely fair. Now I find that (a) it isn't (b) donating
> > > runtimes between tasks on different cpus is problematic.
> >
> > True and true. However, would you _want_ the scheduler to hold runnable
> > tasks hostage, and thus let CPU go to waste in the name of perfect
> > fairness? Perfect is the enemy of good applies to that idea imho.
>
> Sorry, I don't see how it follows.

Let's just forget theoretical views, and concentrate on a forward path.

> > > Moving tasks between cpus is expensive and sometimes prohibited by
> > > pinning. I'd like to avoid it if possible, but it's better than nothing.
> >
> > Expensive in many ways, so let's try to not do that.
> >
> > So why do you need this other task to run before you do, even cross cpu?
> > If he's a lock holder, getting him to the cpu will give him a chance to
> > drop, no? Isn't that what you want to get done? Drop that lock so you
> > or someone else can get something other than spinning done?
>
> Correct. I don't want the other task to run before me, I just don't
> want to run before it.

OK, so what I gather is that if you can preempt another of your own
threads to get the target to cpu, that would be a good thing whether
he's on the same cpu as yield_to() caller or not. If the target is
sharing a cpu with you, that's even better. Correct?

Would a kick/hint option be useful?
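
To make the hint idea concrete, here is a toy user-space model. It is not
kernel code: struct toy_task, struct toy_rq, toy_set_next_hint() and
toy_pick_next() are names made up for illustration, and the single "gran"
threshold is an assumption standing in for "the target hasn't had too much
cpu". The only thing borrowed from the discussion is the shape: a one-shot
"next" slot that the pick path honors only when the hinted task is not too
far ahead of leftmost.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy runnable task: only the field the pick logic needs. */
struct toy_task {
    const char *name;
    uint64_t vruntime;      /* virtual runtime; lower means owed more cpu */
};

/* Toy per-cpu runqueue; 'next' is a hint slot loosely modelled on cfs_rq->next. */
struct toy_rq {
    struct toy_task **tasks;
    size_t nr;
    struct toy_task *next;  /* NULL when nobody left a hint */
    uint64_t gran;          /* assumed: how far past leftmost a hinted task may be */
};

/* Linear scan standing in for the leftmost node of the rbtree. */
static struct toy_task *toy_leftmost(const struct toy_rq *rq)
{
    assert(rq->nr > 0);
    struct toy_task *left = rq->tasks[0];
    for (size_t i = 1; i < rq->nr; i++)
        if (rq->tasks[i]->vruntime < left->vruntime)
            left = rq->tasks[i];
    return left;
}

/* Leave a hint; it is a request, not a guarantee. */
static void toy_set_next_hint(struct toy_rq *rq, struct toy_task *target)
{
    rq->next = target;
}

/* Pick leftmost unless a hint exists and the hinted task is within 'gran'. */
static struct toy_task *toy_pick_next(struct toy_rq *rq)
{
    struct toy_task *left = toy_leftmost(rq);
    struct toy_task *hint = rq->next;

    rq->next = NULL;        /* the hint is one-shot */

    /* Signed difference so a hinted task behind leftmost also qualifies. */
    if (hint && (int64_t)(hint->vruntime - left->vruntime) <= (int64_t)rq->gran)
        return hint;
    return left;
}
```

With three tasks at vruntime 100, 150 and 900 and gran set to 100, hinting
the 150 task gets it picked ahead of leftmost, while hinting the 900 task is
ignored and leftmost runs. That is the sense in which the hint cannot shred
anybody else's throughput: it only reorders among tasks that are already
close to their turn.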

-Mike
