Re: [RFC] Improving udelay/ndelay on platforms where that is possible

From: Pavel Machek
Date: Thu Dec 07 2017 - 07:43:31 EST


> On Wed, Nov 15, 2017 at 01:51:54PM +0100, Marc Gonzalez wrote:
> > On 01/11/2017 20:38, Marc Gonzalez wrote:
> >
> > > OK, I'll just send my patch, and then crawl back under my rock.
> >
> > Linus,
> >
> > As promised, the patch is provided below. And as promised, I will
> > no longer bring this up on LKML.
> >
> > FWIW, I have checked that the computed value matches the expected
> > value for all HZ and delay_us, and for a few clock frequencies,
> > using the following program:
> >
> > $ cat delays.c
> > #include <stdio.h>
> > #define MEGA 1000000u
> > typedef unsigned int uint;
> > typedef unsigned long long u64;
> > #define DIV_ROUND_UP(n,d) (((n) + (d) - 1) / (d))
> >
> > static const uint HZ_tab[] = { 100, 250, 300, 1000 };
> >
> > static void check_cycle_count(uint freq, uint HZ, uint delay_us)
> > {
> > 	uint UDELAY_MULT = (2147 * HZ) + (483648 * HZ / MEGA);
> > 	uint lpj = DIV_ROUND_UP(freq, HZ);
> > 	uint computed = ((u64)lpj * delay_us * UDELAY_MULT >> 31) + 1;
> > 	uint expected = DIV_ROUND_UP((u64)delay_us * freq, MEGA);
> >
> > 	if (computed != expected)
> > 		printf("freq=%u HZ=%u delay_us=%u comp=%u exp=%u\n",
> > 		       freq, HZ, delay_us, computed, expected);
> > }
> >
> > int main(void)
> > {
> > 	uint idx, delay_us, freq;
> >
> > 	for (freq = 3*MEGA; freq <= 100*MEGA; freq += 3*MEGA)
> > 		for (idx = 0; idx < sizeof HZ_tab / sizeof *HZ_tab; ++idx)
> > 			for (delay_us = 1; delay_us <= 2000; ++delay_us)
> > 				check_cycle_count(freq, HZ_tab[idx], delay_us);
> >
> > 	return 0;
> > }
> >
> >
> >
> > -- >8 --
> > Subject: [PATCH] ARM: Tweak clock-based udelay implementation
> >
> > In 9f8197980d87a ("delay: Add explanation of udelay() inaccuracy")
> > Russell pointed out that loop-based delays may return early.
> >
> > On the arm platform, delays may be either loop-based or clock-based.
> >
> > This patch tweaks the clock-based implementation so that udelay(N)
> > is guaranteed to spin at least N microseconds.
>
> As I've already said, I don't want this, because it encourages people
> to use too-small delays in driver code, and if we merge it then you
> will look at your data sheet, decide it says "you need to wait 10us"
> and write in your driver "udelay(10)" which will break on the loops
> based delay.
>
> udelay() needs to offer a consistent interface so that drivers know
> what to expect no matter what the implementation is. Making one
> implementation conform to your ideas while leaving the other
> implementations with other expectations is a recipe for bugs.

udelay() needs to be consistent across platforms, and yes, udelay(10)
is expected to delay at least 10 usec.

If that is not true on your platform, _fix your platform_. But it is
not valid to reject patches fixing other platforms, just because your
platform is broken.
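
To make "at least 10 usec" concrete, here is a rough sketch of the
rounding issue for a clock-based delay. It is not the kernel code and
not the patch above; the helper names (cycles_truncated,
cycles_rounded_up, covers_delay) and the 19.2 MHz example frequency are
made up for illustration. The point is only that converting
microseconds to timer cycles with truncating division can ask for too
few cycles, while rounding up cannot.

```c
#include <assert.h>
#include <stdint.h>

#define USEC_PER_SEC 1000000u

/* Truncating conversion: may request fewer cycles than the delay needs. */
static uint64_t cycles_truncated(uint32_t freq_hz, uint32_t delay_us)
{
	return (uint64_t)delay_us * freq_hz / USEC_PER_SEC;
}

/* Rounding-up conversion: never fewer cycles than delay_us requires. */
static uint64_t cycles_rounded_up(uint32_t freq_hz, uint32_t delay_us)
{
	uint64_t n = (uint64_t)delay_us * freq_hz;

	return (n + USEC_PER_SEC - 1) / USEC_PER_SEC;
}

/* Returns 1 if spinning for `cycles` ticks lasts at least delay_us. */
static int covers_delay(uint64_t cycles, uint32_t freq_hz, uint32_t delay_us)
{
	assert(freq_hz > 0);
	/* cycles / freq_hz >= delay_us / 1e6, rearranged to avoid division */
	return cycles * USEC_PER_SEC >= (uint64_t)delay_us * freq_hz;
}
```

With a hypothetical 19.2 MHz timer, a 1 usec delay needs 19.2 cycles:
cycles_truncated() gives 19, which is short, and cycles_rounded_up()
gives 20, which is enough.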

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html