Re: [RFC] [PATCH] timer: Added usleep[_range][_interruptable] timer

From: Andreas Mohr
Date: Wed Jun 23 2010 - 18:04:48 EST


> I think you need to do some more research on what you're actually doing to
> the system. From what you're showing us one could make a lot of different
> arguments as to what this change will actually do. You really need some
> sort of test that doesn't leave a lot of room for argument.

I think the underlying issue he's having is that the timer APIs are simply
poorly adapted to what drivers need: they're awkward to use.

From a driver POV, there really isn't much that you'd really CARE ABOUT
when entering any delay.

All you care about is to get a reliable delay, with the following characteristics:
- requested delay value
- wakeup spread (do I need this with hawk-eye precision, or is it ok if
wakeup is in the next century)
- something else? (perhaps "I need a warm/cold cache"?)
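
To make that list concrete, here is a rough user-space sketch, not kernel code.
The names struct delay_req and delay_wait() are made up for illustration and do
not exist in any kernel API; nanosleep() merely stands in for whatever mechanism
the OS would really pick. The caller states the delay and the tolerated spread,
nothing else:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical: the only two things the caller has to state. */
struct delay_req {
    uint64_t delay_ns;   /* requested delay: wake no earlier than this */
    uint64_t spread_ns;  /* tolerated lateness beyond delay_ns */
};

/* Current monotonic time in nanoseconds. */
static uint64_t now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Sleep for at least req->delay_ns.
 * Returns 1 if the wakeup landed within delay_ns + spread_ns,
 * 0 if it came later than the caller said it could tolerate.
 */
int delay_wait(const struct delay_req *req)
{
    uint64_t start;
    struct timespec left;

    assert(req != NULL);
    start = now_ns();
    left.tv_sec = (time_t)(req->delay_ns / 1000000000ull);
    left.tv_nsec = (long)(req->delay_ns % 1000000000ull);

    /* When a signal interrupts nanosleep(), it stores the time still left. */
    while (nanosleep(&left, &left) == -1 && errno == EINTR)
        ;

    return now_ns() - start <= req->delay_ns + req->spread_ns;
}
```

A caller that needs 1 ms and can live with waking up to 1 s late would fill in
{ 1000000, 1000000000 } and check the return value; how the wait is actually
carried out stays out of its sight.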

Whether this is preemptable, yieldable, power-managementable or entirely switch-offable
is ENTIRELY FRIGGIN' UNIMPORTANT to a driver, in most cases - it DOES NOT CARE about it.
The driver tells the OS what kind of delay characteristics it needs,
and it's the _OS's_ job to always make the most of that, be that a correspondingly deep
power management idle mode or whatever (one could argue that it should even know
on its own whether a critical section has to be obeyed or not, i.e. whether it's
preemptable or not).

This is just what a _minimal_, perfectly _adapted_ function interface should be.
And I'm afraid the kernel is somewhat off in that regard (mdelay, msleep, udelay,
... OH MY), which likely is why such discussions come up.

And if someone then says "but udelay is a tiny optimized function which is much faster
than some generic interface which would first need to execute a half-dozen branches
to figure out what mode exactly to choose", I say "to hell with it",
let's do the precisely right thing as fast as possible and not the sometimes right thing
perfectly fast (not to mention that always entering via the same central function
might have additional icache benefits, too).

Whether a particular environment is able to support useful power management at
ms, us or even ns granularity should never be a driver's job to worry about;
it should simply pass its requirements to the kernel and that's it.
Such orders of magnitude easily change over time given hardware's progress -
a well-designed, minimal kernel interface however probably won't need to.

Frankly this is just my feeling, I don't have any precise insight into these APIs,
thus I might be way off as to the complications of this and be talking out of my a... :)

Andreas Mohr