Re: dynamic-hz

From: Nish Aravamudan
Date: Tue Dec 14 2004 - 00:27:52 EST


On Mon, 13 Dec 2004 20:29:39 -0800, Andrew Morton <akpm@xxxxxxxx> wrote:
> Nish Aravamudan <nish.aravamudan@xxxxxxxxx> wrote:
>
>
> >
> > On Mon, 13 Dec 2004 03:25:21 -0800, Andrew Morton <akpm@xxxxxxxx> wrote:
> > > Andrea Arcangeli <andrea@xxxxxxx> wrote:
> > > >
> > > > The patch only does HZ at dynamic time. But of course it's absolutely
> > > > trivial to define it at compile time, it's probably a 3 liner on top of
> > > > my current patch ;). However, personally I don't think the three-liner
> > > > will be worth the few extra seconds spent configuring the kernel ;).
> > >
> > > We still have 1000-odd places which do things like
> > >
> > > schedule_timeout(HZ/10);
> >
> > Yes, yes, we do :) I replaced far more than I ever thought I could...
> > There are a few issues I have with the remaining schedule_timeout()
> > calls which I think fit ok with this thread... I'd especially like
> > your input, Andrew, as you end up getting most of my patches from KJ.
> >
> > Many drivers use
> >
> > set_current_state(TASK_{UN,}INTERRUPTIBLE);
> > schedule_timeout(1); // or some other small value < 10
> >
> > This may or may not hide a dependency on a particular HZ value. If the
> > code is somewhat old, perhaps the author intended the task to sleep
> > for 1 jiffy when HZ was equal to 100. That meant that they ended up
> > sleeping for 10 ms. If the code is new, the author intends that the
> > task sleeps for 1 ms (HZ==1000). The question is, what should the
> > replacement be?
>
> Presumably they meant 10 milliseconds. Or at least, that is the delay
> which the developer did his testing with.

OK, I will make a set of these changes soon, hopefully.
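
To make the two readings of schedule_timeout(1) concrete, here is a
minimal userspace sketch of the arithmetic. The helper name ticks_to_ms()
is hypothetical and is not a kernel API; it only models one jiffy as
1000/HZ milliseconds, and it assumes HZ divides 1000 evenly.

```c
#include <assert.h>

/*
 * Hypothetical userspace model, not a kernel function.
 * Returns the length of `ticks` jiffies in milliseconds for a given
 * tick rate `hz`. Assumes hz is nonzero and divides 1000 evenly
 * (true for both HZ values discussed in the thread, 100 and 1000).
 */
static unsigned long ticks_to_ms(unsigned long ticks, unsigned long hz)
{
    assert(hz != 0 && 1000UL % hz == 0);
    return ticks * (1000UL / hz);
}
```

Under this model, ticks_to_ms(1, 100) is 10 and ticks_to_ms(1, 1000) is 1,
which is the 10 ms versus 1 ms ambiguity described above.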

> > If they really meant to use schedule_timeout(1) in the sense of
> > highest resolution delay possible (the latter above), then they
> > probably should just call schedule() directly.
>
> argh. Never do that. It's basically a busywait and can cause lockups if
> the calling task has realtime scheduling policy.

OK, I won't make any such changes in my next set of patches.

> > schedule_timeout(1)
> > simply sets up a timer to fire off after 1 jiffy & then calls
> > schedule() itself. The overhead of setting up a timer and the
> > execution of schedule() itself probably means that the timer will go
> > off in the middle of the schedule() call or very shortly thereafter (I
> > think). In which case, it makes more sense to use schedule()
> > directly...
> >
> > If they meant to schedule a delay of 10ms, then msleep() should be
> > used in those cases. msleep() will also resolve the issues with 0-time
> > timeouts because of rounding, as it adds 1 to the converted parameter.
> >
> > Obviously, changing more and more sleeps to msecs & secs will really
> > help make the changing of HZ more transparent. And specifying the time
> > in real time units just seems so much clearer to me.
> >
> > What do people think?
>
> I'd say that replacing them with msleep(10) is the safest approach.
> Depending on what the surrounding code is actually doing, of course.

Thanks for the info!

-Nish
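
The rounding point made about msleep() in the thread (a converted
timeout of 0 jiffies, avoided by adding 1) can be sketched in userspace C
as follows. All three helper names are hypothetical and are not kernel
functions. The round-up conversion is an assumption about how a
milliseconds-to-jiffies conversion behaves; only the "+ 1" comes from
the discussion above.

```c
#include <assert.h>

/*
 * Hypothetical model: naive conversion that truncates, so a request
 * shorter than one tick becomes 0 ticks.
 */
static unsigned long ms_to_ticks_truncate(unsigned long ms, unsigned long hz)
{
    assert(hz != 0);
    return ms * hz / 1000UL;
}

/*
 * Hypothetical model: conversion that rounds up to a whole number of
 * ticks (assumed behaviour, see the note above).
 */
static unsigned long ms_to_ticks_roundup(unsigned long ms, unsigned long hz)
{
    assert(hz != 0);
    return (ms * hz + 999UL) / 1000UL;
}

/*
 * Hypothetical model of the timeout an msleep()-style helper would
 * request: the converted value plus one extra tick, as described in
 * the thread.
 */
static unsigned long msleep_timeout_ticks(unsigned long ms, unsigned long hz)
{
    return ms_to_ticks_roundup(ms, hz) + 1;
}
```

With hz = 100, the truncating conversion turns a 5 ms request into 0
ticks, while msleep_timeout_ticks(5, 100) gives 2. A 10 ms request gives
2 ticks at hz = 100 and 11 ticks at hz = 1000, so the requested delay is
never shortened by the conversion.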