[2.6.25-rc1] Strange regression with CONFIG_HZ_300=y

From: Carlos R. Mafra
Date: Mon Feb 11 2008 - 08:38:28 EST


I apologize in advance if I am wrong about this, but I noticed a strange
regression relative to 2.6.24 in 2.6.25-rc1 (in cpufreq, I think), which
goes away if I revert the following commit:

commit bdc807871d58285737d50dc6163d0feb72cb0dc2
Author: H. Peter Anvin <hpa@xxxxxxxxx>
Date: Fri Feb 8 04:21:26 2008 -0800

avoid overflows in kernel/time.c

When the conversion factor between jiffies and milli- or microseconds is
not a single multiply or divide, as for the case of HZ == 300, we currently
do a multiply followed by a divide. The intervening result, however, is
subject to overflows, especially since the fraction is not simplified (for
HZ == 300, we multiply by 300 and divide by 1000).

This is exposed to the user when passing a large timeout to poll(), for
example.

This patch replaces the multiply-divide with a reciprocal multiplication on
32-bit platforms. When the input is an unsigned long, there is no portable
way to do this on 64-bit platforms, since it requires a 128-bit intermediate
result (which gcc does support on 64-bit platforms but may generate libgcc
calls, e.g. on 64-bit s390). Since the output is a 32-bit integer in the
cases affected, just simplify the multiply-divide (*3/10 instead of
*300/1000).

The reciprocal multiply used can have off-by-one errors in the upper half
of the valid output range. This could be avoided at the expense of having
to deal with a potential 65-bit intermediate result. Since the intent is
to avoid overflow problems and most of the other time conversions are only
semiexact, the off-by-one errors were considered an acceptable tradeoff.

[...]
[more text follows]
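
To make the arithmetic in the commit message concrete, here is a minimal
sketch in C of the HZ == 300 scaling by 300/1000. It is not the code from
kernel/time.c: the function names and the constant RECIP_300_1000 are
hypothetical, chosen only for illustration, and the reciprocal shown is
just one simple way to build such a multiplier (rounded up, 32-bit shift).

```c
#include <stdint.h>

/* Hypothetical multiplier: ceil(2^32 * 300 / 1000) = (3 * 2^32 + 2) / 10. */
#define RECIP_300_1000 1288490189u

/* Multiply then divide in 32 bits: m * 300 wraps once m > UINT32_MAX / 300. */
uint32_t scale_naive(uint32_t m)
{
    return (uint32_t)(m * 300u) / 1000u;
}

/* Simplified fraction: m * 3 only wraps once m > UINT32_MAX / 3. */
uint32_t scale_simplified(uint32_t m)
{
    return (uint32_t)(m * 3u) / 10u;
}

/* Reference result with a 64-bit intermediate product (no wrap for 32-bit m). */
uint32_t scale_wide(uint32_t m)
{
    return (uint32_t)(((uint64_t)m * 300u) / 1000u);
}

/*
 * Reciprocal multiply: one 32x32->64 multiply and a shift, no division.
 * Because the multiplier is rounded up, the result can be one too large
 * for inputs of 2^31 and above.
 */
uint32_t scale_reciprocal(uint32_t m)
{
    return (uint32_t)(((uint64_t)m * RECIP_300_1000) >> 32);
}
```

With these definitions, an input of 20000000 gives 1705032 from
scale_naive() because the 32-bit product wraps, while the other three
functions return 6000000. For 2147483653 (just above 2^31),
scale_reciprocal() returns one more than scale_wide(), which is the kind
of off-by-one error the commit message accepts as a tradeoff.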

The problem in vanilla 2.6.25-rc1 happens with CONFIG_HZ_300=y (and not
with CONFIG_HZ_250=y, or with the above commit reverted). The CPU frequency no
longer changes regardless of the load: it stays high (2.0 GHz or 1.2 GHz) even
when idle (I checked with 'top'), whereas it normally drops to 800 MHz when idle
(I always use the ondemand governor, compiled in and set as the default governor).
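
For anyone trying to reproduce this, the current frequency can also be
read directly from the cpufreq sysfs files instead of being inferred from
other tools. The following is a minimal sketch in C, assuming the usual
/sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq file is present
(its value is in kHz); the helper names read_khz() and print_cpu0_freq()
are hypothetical.

```c
#include <stdio.h>

/* Read one integer (a frequency in kHz) from a file; return -1 on failure. */
long read_khz(const char *path)
{
    FILE *f = fopen(path, "r");
    long khz = -1;

    if (!f)
        return -1;
    if (fscanf(f, "%ld", &khz) != 1)
        khz = -1;
    fclose(f);
    return khz;
}

/* Print the current cpu0 frequency, or a note if cpufreq is not available. */
void print_cpu0_freq(void)
{
    long khz = read_khz("/sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq");

    if (khz < 0)
        printf("cpu0: cpufreq information not available\n");
    else
        printf("cpu0: %ld kHz\n", khz);
}
```

Calling print_cpu0_freq() a few times while the machine is idle and then
under load should show whether the governor is scaling the frequency at
all.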

The laptop is a Vaio VGN-FZ240E with a Core 2 Duo T7250 @ 2.0 GHz, and the kernel is x86_64.

If anyone needs more information about this, I will be happy to provide it.

Carlos R. Mafra


