Re: [RFC -v2 PATCH 2/3] sched: add yield_to function

From: Avi Kivity
Date: Mon Dec 20 2010 - 04:46:48 EST


On 12/20/2010 11:30 AM, Mike Galbraith wrote:
> >
> > Because preempting a perfect stranger is not courteous, all tasks have
> > to play nice.
>
> I don't want to preempt anybody, simply make the task run before me.

I thought you wanted to get the target to the cpu asap? You just can't
have "he runs before me" cross cpu.

You're right, of course. I'm fine with running in parallel. I'm fine
with him running before or instead of me. I'm not fine with running
while the other guy is waiting.

> Further, this is a kernel internal API, so no need for these types of
> restrictions. If we expose it to userspace, sure.

Doesn't matter whether it's kernel or not afaict. If virtualization has
to coexist peacefully with other loads, it can't just say "my hints are
the only ones that count", and thus shred other loads' throughput.

What does that have to do with being in the same group or not? I want
to maintain fairness (needed for pure virt workloads, one guest cannot
dominate another), but I don't see how being in the same thread group
is relevant.

Again, I don't want more than one entitlement. I want to move part of
my entitlement to another task.

> > > > use cfs_rq->next to pass the scheduler a HINT of what you would LIKE to
> > > > happen.
> > >
> > > Hint is fine, so long as the scheduler seriously considers it.
> >
> > It will take the hint if the target hasn't had too much cpu.
>
> Since I'm running and the target isn't, it's clear the scheduler thinks
> the target had more cpu than I did. That's why I want to donate
> cpu time.

That's not necessarily true; in fact, it's very often false. Last/next
buddy will allow a task to run ahead of leftmost so we don't always
blindly select leftmost and shred cache.

Ok.
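
For anyone following along, the buddy behaviour described above can be
modelled roughly as below. This is only a toy sketch, not the kernel's
code: the toy_* names, the struct layouts and the wakeup_gran tolerance
are all assumptions made for illustration. The only parts taken from
the thread are that the run queue carries a "next" hint, that the hint
may run ahead of leftmost, and that it is honoured only if the target
hasn't had too much cpu.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a scheduling entity (hypothetical, not the kernel's). */
struct toy_entity {
	const char *name;
	uint64_t vruntime;	/* virtual runtime; smaller means "owed more cpu" */
};

/* Toy run queue (hypothetical layout). */
struct toy_rq {
	struct toy_entity *leftmost;	/* entity with the smallest vruntime */
	struct toy_entity *next;	/* hint: who we would LIKE to run next */
	uint64_t wakeup_gran;		/* assumed tolerance for honouring the hint */
};

/*
 * The hint is acceptable only if the hinted entity is not too far ahead
 * of leftmost. The signed difference keeps the comparison safe if the
 * counters wrap.
 */
static bool toy_hint_is_fair(const struct toy_rq *rq,
			     const struct toy_entity *hint)
{
	int64_t lead = (int64_t)(hint->vruntime - rq->leftmost->vruntime);

	return lead <= (int64_t)rq->wakeup_gran;
}

/* Pick leftmost unless a fair hint is pending; the hint is consumed either way. */
static struct toy_entity *toy_pick_next(struct toy_rq *rq)
{
	struct toy_entity *pick = rq->leftmost;

	if (pick && rq->next && toy_hint_is_fair(rq, rq->next))
		pick = rq->next;

	rq->next = NULL;
	return pick;
}
```

In this model a hinted task that is only slightly ahead of leftmost
runs first, while one that is far ahead is ignored and leftmost runs as
usual, which is the sense in which the hint is a hint and not a
guarantee.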

> >
> > What would you suggest? There is no global execution timeline, so if
> > you want to definitely run after this task, you're stuck with moving to
> > his timezone or moving him to yours. Well, you could sleep a while, but
> > we know how productive sleeping is.
>
> I don't know. The whole idea of donating runtime was predicated on CFS
> being completely fair. Now I find that (a) it isn't (b) donating
> runtimes between tasks on different cpus is problematic.

True and true. However, would you _want_ the scheduler to hold runnable
tasks hostage, and thus let CPU go to waste in the name of perfect
fairness? Perfect is the enemy of good applies to that idea imho.

Sorry, I don't see how it follows.

> Moving tasks between cpus is expensive and sometimes prohibited by
> pinning. I'd like to avoid it if possible, but it's better than nothing.

Expensive in many ways, so let's try to not do that.

So why do you need this other task to run before you do, even cross cpu?
If he's a lock holder, getting him to the cpu will give him a chance to
drop, no? Isn't that what you want to get done? Drop that lock so you
or someone else can get something other than spinning done?

Correct. I don't want the other task to run before me, I just don't
want to run before it.

--
error compiling committee.c: too many arguments to function
