Re: Too much error in __const_udelay() ?
From: Dominik Brodowski
Date: Sat Jun 05 2004 - 10:26:56 EST
Hi,
> However I've started to see some problems w/ 2.6 and USB on x440/x445s,
> both of which use the 100Mhz cyclone time source. Further digging has
> pointed to the fact that certain important udelay()s in the USB
> subsystem aren't actually waiting long enough.
Certain? AFAICS _no_ call to a delay routine actually passed a big enough
argument. Or am I missing something? Also, __ndelay seems to be affected
as well: it returns zero for 550 nsec even for the TSC variant in your
test.c.
> So I'm no math wiz. What's the proper fix here?
Below are three changes I'd like to discuss. I'll build a fresh kernel with
all three changes enabled + PM_TIMER soon.
Change 1:
Move the multiplication by HZ up into the mull instruction:
unsigned long __const_udelay(unsigned long xloops)
{
	int d0;
	__asm__("mull %0"
		:"=d" (xloops), "=&a" (d0)
		:"1" (xloops), "0" (LPJ * HZ));
	return __delay(xloops);
}
1 usec: LPJ: 100000 __udelay: 0 vs my_udelay: 99
1 usec: LPJ: 1500000 __udelay: 1000 vs my_udelay: 1499
2 usec: LPJ: 100000 __udelay: 0 vs my_udelay: 199
2 usec: LPJ: 1500000 __udelay: 2000 vs my_udelay: 2999
5 usec: LPJ: 100000 __udelay: 0 vs my_udelay: 499
5 usec: LPJ: 1500000 __udelay: 7000 vs my_udelay: 7498
10 usec: LPJ: 100000 __udelay: 0 vs my_udelay: 999
10 usec: LPJ: 1500000 __udelay: 14000 vs my_udelay: 14996
20 usec: LPJ: 100000 __udelay: 1000 vs my_udelay: 1999
20 usec: LPJ: 1500000 __udelay: 29000 vs my_udelay: 29993
50 usec: LPJ: 100000 __udelay: 4000 vs my_udelay: 4998
50 usec: LPJ: 1500000 __udelay: 74000 vs my_udelay: 74983
100 usec: LPJ: 100000 __udelay: 9000 vs my_udelay: 9997
100 usec: LPJ: 1500000 __udelay: 149000 vs my_udelay: 149966
20000 usec: LPJ: 100000 __udelay: 1999000 vs my_udelay: 1999549
20000 usec: LPJ: 1500000 __udelay: 29993000 vs my_udelay: 29993243
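For reference, the arithmetic behind the two columns above can be sketched in plain user-space C. This is only an illustration, not the kernel code: it assumes HZ = 1000 (which is what reproduces the figures in the table), the function names are made up for the sketch, and it assumes LPJ * HZ still fits into the 32-bit mull operand, as it does for both LPJ values tested here.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_HZ  1000u        /* assumption: HZ = 1000 matches the table */
#define USEC_SCALE 0x000010c6u  /* 2**32 / 1000000, truncated */

/* High 32 bits of a 32x32 -> 64 bit multiply, i.e. what mull leaves in %edx. */
static uint32_t mul_hi32(uint32_t a, uint32_t b)
{
    return (uint32_t)(((uint64_t)a * b) >> 32);
}

/* Current ordering: truncate to whole loops first, then multiply by HZ. */
static uint32_t loops_hz_after(uint32_t usecs, uint32_t lpj)
{
    return mul_hi32(usecs * USEC_SCALE, lpj) * SKETCH_HZ;
}

/* Change 1: multiply LPJ by HZ first, truncate only once at the end. */
static uint32_t loops_hz_before(uint32_t usecs, uint32_t lpj)
{
    /* The mull operand is 32 bits wide, so the product must fit. */
    assert((uint64_t)lpj * SKETCH_HZ <= UINT32_MAX);
    return mul_hi32(usecs * USEC_SCALE, lpj * SKETCH_HZ);
}
```

With the old ordering everything below one full loop-per-jiffy unit is thrown away before the multiplication by HZ, which is why the __udelay column only ever shows multiples of 1000.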
Change 2:
Round up in __udelay. While it can be argued that some time is also
spent in the delay functions, it's better to spend _at least_ the specified
time sleeping, in my humble opinion.
- return __const_udelay2(usecs * 0x000010c6); /* 2**32 / 1000000 */
+ return __const_udelay2(usecs * 0x000010c7); /* 2**32 / 1000000 (rounded up)*/
1 usec: LPJ: 100000 __udelay: 0 vs my_udelay: 100
1 usec: LPJ: 1500000 __udelay: 1000 vs my_udelay: 1500
2 usec: LPJ: 100000 __udelay: 0 vs my_udelay: 200
2 usec: LPJ: 1500000 __udelay: 2000 vs my_udelay: 3000
5 usec: LPJ: 100000 __udelay: 0 vs my_udelay: 500
5 usec: LPJ: 1500000 __udelay: 7000 vs my_udelay: 7500
10 usec: LPJ: 100000 __udelay: 0 vs my_udelay: 1000
10 usec: LPJ: 1500000 __udelay: 14000 vs my_udelay: 15000
20 usec: LPJ: 100000 __udelay: 1000 vs my_udelay: 2000
20 usec: LPJ: 1500000 __udelay: 29000 vs my_udelay: 30000
50 usec: LPJ: 100000 __udelay: 4000 vs my_udelay: 5000
50 usec: LPJ: 1500000 __udelay: 74000 vs my_udelay: 75000
100 usec: LPJ: 100000 __udelay: 9000 vs my_udelay: 10000
100 usec: LPJ: 1500000 __udelay: 149000 vs my_udelay: 150001
20000 usec: LPJ: 100000 __udelay: 1999000 vs my_udelay: 2000015
20000 usec: LPJ: 1500000 __udelay: 29993000 vs my_udelay: 30000228
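The effect of the rounded-up constant can be checked with a small user-space sketch. Again this is only an illustration under assumptions: HZ = 1000, the multiplication by HZ already moved in front of the truncation, and helper names invented for the sketch.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_HZ 1000u  /* assumption: HZ = 1000 matches the table */

/* ceil(2**32 / 1000000), computed in 64-bit arithmetic. */
static uint32_t usec_scale_rounded_up(void)
{
    const uint64_t two32 = (uint64_t)1 << 32;
    return (uint32_t)((two32 + 1000000 - 1) / 1000000);
}

/* Loop count for a given scale factor: high 32 bits of the 64-bit product. */
static uint32_t udelay_loops(uint32_t usecs, uint32_t scale, uint32_t lpj)
{
    /* The second factor has to fit into a 32-bit mull operand. */
    assert((uint64_t)lpj * SKETCH_HZ <= UINT32_MAX);
    return (uint32_t)(((uint64_t)(usecs * scale) * (lpj * SKETCH_HZ)) >> 32);
}
```

With 0x10c6 a 10 usec delay at LPJ 100000 comes out as 999 loops, one short of the ideal 1000; with 0x10c7 it is 1000. The price is a slight overshoot on long delays, e.g. 2000015 instead of 2000000 loops for 20000 usec.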
Change 3:
Make sure at least 1 loop is spent: for really small ndelay() calls on
low-MHz timers, this might be better.
return __delay(xloops ? xloops : 1);
Before:
1 nsec: LPJ: 100000 __ndelay: 0 vs my_udelay: 0
2 nsec: LPJ: 100000 __ndelay: 0 vs my_udelay: 0
5 nsec: LPJ: 100000 __ndelay: 0 vs my_udelay: 0
10 nsec: LPJ: 100000 __ndelay: 0 vs my_udelay: 1
20 nsec: LPJ: 100000 __udelay: 0 vs my_udelay: 2
50 nsec: LPJ: 100000 __ndelay: 0 vs my_udelay: 5
After:
1 nsec: LPJ: 100000 __udelay: 0 vs my_udelay: 1
2 nsec: LPJ: 100000 __udelay: 0 vs my_udelay: 1
5 nsec: LPJ: 100000 __udelay: 0 vs my_udelay: 1
10 nsec: LPJ: 100000 __udelay: 0 vs my_udelay: 1
20 nsec: LPJ: 100000 __udelay: 0 vs my_udelay: 2
50 nsec: LPJ: 100000 __udelay: 0 vs my_udelay: 5
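A user-space sketch of the clamp, for illustration only. It assumes HZ = 1000 and an ndelay scale factor of 5 (2**32 / 1000000000, rounded up), which is what reproduces the numbers above; the function names are invented for the sketch.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_HZ  1000u  /* assumption: HZ = 1000 matches the table */
#define NSEC_SCALE 5u     /* assumption: 2**32 / 1000000000, rounded up */

/* Unclamped loop count: high 32 bits of the 64-bit product. */
static uint32_t ndelay_loops(uint32_t nsecs, uint32_t lpj)
{
    /* The second factor has to fit into a 32-bit mull operand. */
    assert((uint64_t)lpj * SKETCH_HZ <= UINT32_MAX);
    return (uint32_t)(((uint64_t)(nsecs * NSEC_SCALE) * (lpj * SKETCH_HZ)) >> 32);
}

/* Change 3: never hand zero loops to the delay routine. */
static uint32_t ndelay_loops_min1(uint32_t nsecs, uint32_t lpj)
{
    uint32_t xloops = ndelay_loops(nsecs, lpj);
    return xloops ? xloops : 1;
}
```

Only the requests that would otherwise truncate to zero are affected; anything that already yields one loop or more passes through unchanged.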
Dominik