Re: [2.6.16-mm1 patch] throttling tree patches

From: Peter Williams
Date: Fri Mar 24 2006 - 19:34:38 EST


Mike Galbraith wrote:
Greetings,

I've broken down my throttling tree into 6 patches, which I'll send as
replies to this start-point.

Patch 1/6

Ignore timewarps caused by SMP timestamp rounding. Also, don't stamp a
task with a computed timestamp, stamp with the already called clock.

Signed-off-by: Mike Galbraith <efault@xxxxxx>

--- linux-2.6.16-mm1/kernel/sched.c.org 2006-03-23 15:01:41.000000000 +0100
+++ linux-2.6.16-mm1/kernel/sched.c 2006-03-23 15:02:25.000000000 +0100
@@ -805,6 +805,16 @@
unsigned long long __sleep_time = now - p->timestamp;
unsigned long sleep_time;
+ /*
+ * On SMP systems, a task can go to sleep on one CPU and
+ * wake up on another. When this happens, the timestamp
+ * is rounded to the nearest tick,

Is this true? There's no rounding that I can see. An attempt is made to adjust the timestamp for the drift between time as seen from the two CPUs but there's no deliberate rounding involved. Of course, that's not to say that the adjustment is accurate as I'm not convinced that the difference between the run queues' timestamp_last_tick is a always totally accurate measure of the drift in their clocks (due to possible races -- I may be wrong).

Of course, that doesn't mean that this chunk of code isn't required just that the comment is misleading.

which can lead to now
+ * being less than p->timestamp for short sleeps. Ignore
+ * these, they're insignificant.
+ */
+ if (unlikely(now < p->timestamp))
+ __sleep_time = 0ULL;
+
if (batch_task(p))
sleep_time = 0;
else {
@@ -871,20 +881,20 @@
*/
static void activate_task(task_t *p, runqueue_t *rq, int local)
{
- unsigned long long now;
+ unsigned long long now, comp;
- now = sched_clock();
+ now = comp = sched_clock();
#ifdef CONFIG_SMP
if (!local) {
/* Compensate for drifting sched_clock */
runqueue_t *this_rq = this_rq();
- now = (now - this_rq->timestamp_last_tick)
+ comp = (now - this_rq->timestamp_last_tick)
+ rq->timestamp_last_tick;
}
#endif
if (!rt_task(p))
- p->prio = recalc_task_prio(p, now);
+ p->prio = recalc_task_prio(p, comp);
/*
* This checks to make sure it's not an uninterruptible task

I think that this will end up with p->timestamp being set with a time appropriate to the current task's CPU rather than its own.

Peter
--
Peter Williams pwil3058@xxxxxxxxxxxxxx

"Learning, n. The kind of ignorance distinguishing the studious."
-- Ambrose Bierce
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/