Re: dynamic-hz

From: Nish Aravamudan
Date: Mon Dec 13 2004 - 22:57:28 EST


On Mon, 13 Dec 2004 03:25:21 -0800, Andrew Morton <akpm@xxxxxxxx> wrote:
> Andrea Arcangeli <andrea@xxxxxxx> wrote:
> >
> > The patch only does HZ at dynamic time. But of course it's absolutely
> > trivial to define it at compile time, it's probably a 3 liner on top of
> > my current patch ;). However personally I don't think the three liner
> > will worth the few seconds more spent configuring the kernel ;).
>
> We still have 1000-odd places which do things like
>
> schedule_timeout(HZ/10);

Yes, yes, we do :) I replaced far more than I ever thought I could...
There are a few issues I have with the remaining schedule_timeout()
calls which I think fit ok with this thread... I'd especially like
your input, Andrew, as you end up getting most of my patches from KJ.

Many drivers use

set_current_state(TASK_{UN,}INTERRUPTIBLE);
schedule_timeout(1); // or some other small value < 10

This may or may not hide a dependency on a particular HZ value. If the
code is somewhat old, perhaps the author intended the task to sleep
for 1 jiffy when HZ was equal to 100. That meants that they ended up
sleeping for 10 ms. If the code is new, the author intends that the
task sleeps for 1 ms (HZ==1000). The question is, what should the
replacement be?

If they really meant to use schedule_timeout(1) in the sense of
highest resolution delay possible (the latter above), then they
probably should just call schedule() directly. schedule_timeout(1)
simply sets up a timer to fire off after 1 jiffy & then calls
schedule() itself. The overhead of setting up a timer and the
execution of schedule() itself probably means that the timer will go
off in the middle of the schedule() call or very shortly thereafter (I
think). In which case, it makes more sense to use schedule()
directly...

If they meant to schedule a delay of 10ms, then msleep() should be
used in those cases. msleep() will also resolve the issues with 0-time
timeouts because of rounding, as it adds 1 to the converted parameter.

Obviously, changing more and more sleeps to msecs & secs will really
help make the changing of HZ more transparent. And specifying the time
in real time units just seems so much clearer to me.

What do people think?

-Nish
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/