Re: [RFC PATCH 2/3] sched: add yield_to function

From: Rik van Riel
Date: Wed Dec 08 2010 - 12:56:01 EST


On 12/03/2010 08:23 AM, Peter Zijlstra wrote:
On Thu, 2010-12-02 at 14:44 -0500, Rik van Riel wrote:
unsigned long clone_flags);
+
+#ifdef CONFIG_SCHED_HRTICK
+extern u64 slice_remain(struct task_struct *);
+extern void yield_to(struct task_struct *);
+#else
+static inline void yield_to(struct task_struct *p) yield()
+#endif

That does SCHED_HRTICK have to do with any of this?

Legacy from an old prototype this patch is based on.
I'll get rid of that.

+/**
+ * requeue_task - requeue a task which priority got changed by yield_to

priority doesn't seem the right word, you're not actually changing
anything related to p->*prio

True, I'll change the comment.

+ * @rq: the tasks's runqueue
+ * @p: the task in question
+ * Must be called with the runqueue lock held. Will cause the CPU to
+ * reschedule if p is now at the head of the runqueue.
+ */
+void requeue_task(struct rq *rq, struct task_struct *p)
+{
+ assert_spin_locked(&rq->lock);
+
+ if (!p->se.on_rq || task_running(rq, p) || task_has_rt_policy(p))
+ return;
+
+ dequeue_task(rq, p, 0);
+ enqueue_task(rq, p, 0);
+
+ resched_task(p);

I guess that wants to be something like check_preempt_curr()

Done. Thanks for pointing that out.

@@ -6797,6 +6817,36 @@ SYSCALL_DEFINE3(sched_getaffinity, pid_t, pid, unsigned int, len,
return ret;
}

+#ifdef CONFIG_SCHED_HRTICK

Still wondering what all this has to do with SCHED_HRTICK..

+/*
+ * Yield the CPU, giving the remainder of our time slice to task p.
+ * Typically used to hand CPU time to another thread inside the same
+ * process, eg. when p holds a resource other threads are waiting for.
+ * Giving priority to p may help get that resource released sooner.
+ */
+void yield_to(struct task_struct *p)
+{
+ unsigned long flags;
+ struct sched_entity *se =&p->se;
+ struct rq *rq;
+ struct cfs_rq *cfs_rq;
+ u64 remain = slice_remain(current);
+
+ rq = task_rq_lock(p,&flags);
+ if (task_running(rq, p) || task_has_rt_policy(p))
+ goto out;

See, this all ain't nice, slice_remain() don't make no sense to be
called for !fair tasks.

Why not write:

if (curr->sched_class == p->sched_class&&
curr->sched_class->yield_to)
curr->sched_class->yield_to(curr, p);

or something, and then implement sched_class_fair::yield_to only,
leaving it a NOP for all other classes.

Done.

+ cfs_rq = cfs_rq_of(se);
+ se->vruntime -= remain;
+ if (se->vruntime< cfs_rq->min_vruntime)
+ se->vruntime = cfs_rq->min_vruntime;

Now here we have another problem, remain was measured in wall-time, and
then you go change a virtual time measure using that. These things are
related like:

vt = t/weight

So you're missing a weight factor somewhere.

Also, that check against min_vruntime doesn't really make much sense.

OK, how do I do this?

+ requeue_task(rq, p);

Just makes me wonder why you added requeue task to begin with.. why not
simply dequeue at the top of this function, and enqueue at the tail,
like all the rest does: see rt_mutex_setprio(), set_user_nice(),
sched_move_task().

Done.

+ out:
+ task_rq_unlock(rq,&flags);
+ yield();
+}
+EXPORT_SYMBOL(yield_to);

EXPORT_SYMBOL_GPL() pretty please, I really hate how kvm is a module and
needs to export hooks all over the core kernel :/

Done.

Right, so another approach might be to simply swap the vruntime between
curr and p.

Doesn't that run into the same scale issue you described
above?

--
All rights reversed
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/