Re: time slice cfq comments

From: Con Kolivas
Date: Sat Dec 11 2004 - 08:57:49 EST


Ingo Molnar wrote:
* Con Kolivas <kernel@xxxxxxxxxxx> wrote:


Hi Jens

Just thought I'd make a few comments about some of the code in your
time sliced cfq.


(this code was actually a quick hack from me.)

Heh I wondered why Jens was diddling with cpu scheduler code ;)

+	if (p->array)
+		return min(cpu_curr(task_cpu(p))->time_slice,
+					(unsigned int)MAX_SLEEP_AVG);

MAX_SLEEP_AVG is basically 10 * the average time_slice so this will
always return cpu_curr(task_cpu(p))->time_slice as the min value (except
for the race you described in your comments). What you probably want is


the min() is there to not get ridiculous results due to the runqueue
race, nothing else. Basically i didn't want to lock the runqueue to do
something that is an estimation anyway, and rq->curr might be invalid.
This was a proof-of-concept thing i wrote for Jens; if it works out then
i think we want to lock the runqueue nevertheless, to not dereference
possibly deallocated tasks (and to not trip up things like
DEBUG_PAGEALLOC).

I understood that. I just thought that DEF_TIMESLICE would be a better upper bound.
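
To make the difference between the two bounds concrete, here is a small userspace sketch (not kernel code). It assumes HZ = 1000 and a 100 ms DEF_TIMESLICE, with MAX_SLEEP_AVG taken as DEF_TIMESLICE * MAX_BONUS and MAX_BONUS = 10, in line with the "10 * the average time_slice" figure above. clamp_slice() is a hypothetical helper standing in for the min() in the patch.

```c
#include <assert.h>

/* Assumed values, for illustration only; real ones depend on the kernel. */
#define HZ            1000
#define DEF_TIMESLICE (100 * HZ / 1000)            /* 100 jiffies at HZ=1000 */
#define MAX_BONUS     10
#define MAX_SLEEP_AVG (DEF_TIMESLICE * MAX_BONUS)  /* 1000 jiffies */

/*
 * Hypothetical stand-in for the min() in the patch: cap a (possibly
 * racy) time_slice read at 'bound' jiffies.
 */
static unsigned int clamp_slice(unsigned int time_slice, unsigned int bound)
{
	assert(bound > 0);
	return time_slice < bound ? time_slice : bound;
}
```

Under these assumptions a sane slice of 100 jiffies passes through either bound unchanged, while a garbage read of 5000 is capped at 1000 with MAX_SLEEP_AVG but at 100 with DEF_TIMESLICE.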

Further down you do:
+	/*
+	 * for blocked tasks, return half of the average sleep time.
+	 * (because this is the average sleep-time we'll see if we
+	 * sample the period randomly.)
+	 */
+	return NS_TO_JIFFIES(p->sleep_avg) / 2;

Unfortunately p->sleep_avg is a non-linear value (weighted upwards towards MAX_SLEEP_AVG). I suspect here you want:

+	return NS_TO_JIFFIES(p->sleep_avg) / MAX_BONUS;


sleep_avg might be nonlinear, but nevertheless it's an estimation of the
sleep time of a task. It's different if the task is interactive. We
cannot know how much the task really will sleep; what we want is a good
guess. I didn't want to complicate things too much, as long as the
ballpark figure is right. (i.e. as long as the function returns '0' for
on-runqueue tasks, returns a large value for long sleepers and returns
something in between for short/medium sleepers.) We can later on
complicate it with things like looking at p->timestamp to figure out how
long it has been sleeping (and thus the ->sleep_avg is perhaps not
authoritative anymore), but i kept it simple & stupid for now.


I don't see any need for / 2.


the need for /2 is this: ->sleep_avg tells us the average _full_ sleep
period time (roughly). The CFQ IO-scheduler is sampling the task
_sometime_ during that period, randomly. So on average the task will
sleep another half of the sleep-average. Ok?
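
The random-sampling argument can be checked with a small userspace sketch (not kernel code). It assumes a single sleep period of fixed length, sampled with equal probability at each whole-jiffy offset; mean_remaining_sleep() is a hypothetical helper, not something from the patch.

```c
#include <assert.h>

/*
 * Average remaining sleep, in jiffies, when a sleep period of 'period'
 * jiffies is sampled at every whole-jiffy offset 0..period-1 with
 * equal probability. Illustrative helper only.
 */
static double mean_remaining_sleep(unsigned int period)
{
	double sum = 0.0;
	unsigned int t;

	assert(period > 0);
	for (t = 0; t < period; t++)
		sum += (double)(period - t);	/* time left if sampled at offset t */
	return sum / period;
}
```

For a 1000-jiffy period this gives 500.5, which is period/2 up to the rounding that comes from sampling on whole jiffies. It says nothing about whether sleep_avg itself is a good estimate of the full period.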

sleep_avg accumulates over time, or it can be gathered all within one sleep period. So as well as being non-linear, we have the situation of not knowing whether it accumulated gradually or the task sleeps for > 1 second at a time. I still think it needs to be divided by the number of timeslices that fit into MAX_SLEEP_AVG, which by design is MAX_BONUS, as the likely case is that it accumulates over time. Either way I think we'll be way out, so it probably won't matter, since this ends up being a weighting rather than an accurate measure.
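
A small userspace sketch (not kernel code) of the two candidate estimates, side by side. It works in jiffies directly rather than going through NS_TO_JIFFIES(), and assumes MAX_BONUS = 10 as in the "10 * the average time_slice" figure earlier in the thread. est_half() and est_per_slice() are hypothetical names for illustration.

```c
#include <assert.h>

#define MAX_BONUS 10	/* assumed value, for illustration */

/* Estimate used in the patch: half of the average sleep time. */
static unsigned long est_half(unsigned long sleep_avg_jiffies)
{
	return sleep_avg_jiffies / 2;
}

/*
 * Alternative: divide by the number of timeslices that fit into
 * MAX_SLEEP_AVG, on the assumption that sleep_avg built up gradually
 * over many short sleeps.
 */
static unsigned long est_per_slice(unsigned long sleep_avg_jiffies,
				   unsigned int max_bonus)
{
	assert(max_bonus > 0);
	return sleep_avg_jiffies / max_bonus;
}
```

For a saturated sleep_avg of 1000 jiffies (MAX_SLEEP_AVG if HZ = 1000 and DEF_TIMESLICE is 100 ms, both assumptions), the first returns 500 and the second returns 100, i.e. one default timeslice. Which is closer depends on whether the task really sleeps in one long stretch or in many short ones.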

I don't feel strongly about these values, I just originally thought it was Jens' interpretation of the values.

Cheers,
Con