Re: dynamic-hz

From: Andrew Morton
Date: Mon Dec 13 2004 - 23:35:28 EST


Nish Aravamudan <nish.aravamudan@xxxxxxxxx> wrote:
>
> On Mon, 13 Dec 2004 03:25:21 -0800, Andrew Morton <akpm@xxxxxxxx> wrote:
> > Andrea Arcangeli <andrea@xxxxxxx> wrote:
> > >
> > > The patch only does HZ at dynamic time. But of course it's absolutely
> > > trivial to define it at compile time, it's probably a 3 liner on top of
> > > my current patch ;). However personally I don't think the three liner
> > > will be worth the few seconds more spent configuring the kernel ;).
> >
> > We still have 1000-odd places which do things like
> >
> > schedule_timeout(HZ/10);
>
> Yes, yes, we do :) I replaced far more than I ever thought I could...
> There are a few issues I have with the remaining schedule_timeout()
> calls which I think fit ok with this thread... I'd especially like
> your input, Andrew, as you end up getting most of my patches from KJ.
>
> Many drivers use
>
> set_current_state(TASK_{UN,}INTERRUPTIBLE);
> schedule_timeout(1); // or some other small value < 10
>
> This may or may not hide a dependency on a particular HZ value. If the
> code is somewhat old, perhaps the author intended the task to sleep
> for 1 jiffy when HZ was equal to 100. That meant that they ended up
> sleeping for 10 ms. If the code is new, the author intends that the
> task sleeps for 1 ms (HZ==1000). The question is, what should the
> replacement be?

Presumably they meant 10 milliseconds. Or at least, that is the delay
which the developer did his testing with.
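
For reference, the arithmetic behind the two readings can be sketched in
userspace C. This is only an illustration: jiffies_to_ms_at() is a
hypothetical helper, not a kernel API, and it takes HZ as a parameter so
both cases can be compared.

```c
/* Userspace illustration only; jiffies_to_ms_at() is a hypothetical
 * helper and not part of the kernel. */

/* Duration in milliseconds of `ticks` jiffies when the timer
 * interrupt fires `hz` times per second. */
unsigned long jiffies_to_ms_at(unsigned long ticks, unsigned long hz)
{
    return ticks * 1000UL / hz;
}
```

So schedule_timeout(1) corresponds to jiffies_to_ms_at(1, 100), which is
10 ms, on a HZ=100 kernel, and to jiffies_to_ms_at(1, 1000), which is
1 ms, on a HZ=1000 kernel.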

> If they really meant to use schedule_timeout(1) in the sense of
> highest resolution delay possible (the latter above), then they
> probably should just call schedule() directly.

argh. Never do that. It's basically a busywait and can cause lockups if
the calling task has realtime scheduling policy.

> schedule_timeout(1)
> simply sets up a timer to fire off after 1 jiffy & then calls
> schedule() itself. The overhead of setting up a timer and the
> execution of schedule() itself probably means that the timer will go
> off in the middle of the schedule() call or very shortly thereafter (I
> think). In which case, it makes more sense to use schedule()
> directly...
>
> If they meant to schedule a delay of 10ms, then msleep() should be
> used in those cases. msleep() will also resolve the issues with 0-time
> timeouts because of rounding, as it adds 1 to the converted parameter.
>
> Obviously, changing more and more sleeps to msecs & secs will really
> help make the changing of HZ more transparent. And specifying the time
> in real time units just seems so much clearer to me.
>
> What do people think?

I'd say that replacing them with msleep(10) is the safest approach.
Depending on what the surrounding code is actually doing, of course.
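
A rough userspace model of why msleep(10) behaves sensibly at either HZ
value, assuming the conversion rounds up and msleep() then adds one tick
as described above. Both function names are hypothetical and HZ is
passed as a parameter purely for comparison; this is not the kernel's
implementation.

```c
/* Userspace model only; both helpers are hypothetical names, not
 * kernel functions. */

/* Convert milliseconds to jiffies at a given tick rate, rounding up
 * so that a nonzero delay never turns into zero ticks. */
unsigned long ms_to_jiffies_at(unsigned long ms, unsigned long hz)
{
    return (ms * hz + 999UL) / 1000UL;
}

/* Number of ticks an msleep()-style call would wait for: the
 * converted value plus one, as described in the quoted text. */
unsigned long msleep_ticks_at(unsigned long ms, unsigned long hz)
{
    return ms_to_jiffies_at(ms, hz) + 1UL;
}
```

Under these assumptions a 10 ms request becomes 2 ticks at HZ=100 and
11 ticks at HZ=1000, so the sleep is never shorter than 10 ms in either
configuration, whereas a bare schedule_timeout(1) shrinks from 10 ms to
1 ms when HZ goes from 100 to 1000.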
