Re: Measured overhead of timer interrupts

Albert D. Cahalan (acahalan@cs.uml.edu)
Thu, 22 Jul 1999 23:16:45 -0400 (EDT)


Andi Kleen writes:
> On Thu, Jul 22, 1999 at 09:12:22PM +0200, Khimenko Victor wrote:
>> In <19990722171109.A1783@fred.muc.de> Andi Kleen (ak@muc.de) wrote:
>>> On Thu, Jul 22, 1999 at 05:02:14PM +0200, Ingo Molnar wrote:
>>>> On 22 Jul 1999, Andi Kleen wrote:

>>>>> It is a kernel bug, HZ is not exported. Actually two,
>>>>> because none of the /proc should have used it as units,
>>>>> but that cannot be fixed anymore.

One value alone is poor. We ought to have:

CLOCKS_HZ for the one ugly system call
PROC_HZ for general /proc output
PROCNET_HZ for /proc/net specifically
TRUE_HZ what the kernel really uses, even if not periodic

>> /proc interface are in permanent change. We should keep it consistent
>> in one stable kernel series but I see no reason to keep it consistent
>> between major kernel updates: quite a lot of things will be changed anyway.
>
> This is not really true. There are new columns or lines added to /proc
> files, but old ones are not changes, and correctly writen /proc parsers
> should not break (unfortunately there are lots of incorrectly writen
> proc parser around..)

I know this is not true. I wrote the new ps. (prepare for rant)
There is no way to write a parser that will survive all of:

1. new columns
2. new spelling for keywords or row names
3. new rows
4. apparent keyword columns that suddenly get whitespace

Together, problems 1 and 4 are hell.
Together, problems 2 and 3 are hell.

It is better to write a dumb, fast, and simple parser. You will need to
replace your parser no matter what, so it is better to have one that is
easy to replace.

This isn't supposed to be like human-generated code you know, and even
that generally comes with a well-defined syntax. Asking parsers to be
able to tolerate /proc changes is like asking a C compiler from 1980
to support Java. Who could have predicted the changes?

It is best to eliminate the parser, reduce in-kernel formatting code,
and reduce the number of system calls needed. The most reasonable API
would use one system call for each type of information, returning
several related values at once. For example, all of the time stamps
should be returned in one call. All of the UID values should be returned
together in a different call. (all of that 64-bit of course)

>>>> we do not want to export HZ, why should we? HZ has no meaning to anything
>>>> else than the kernel. If the kernel exports HZ-dependent values into
>>>> /proc, then that has to be fixed. (yes it might be painful in some cases)

What, make the Alpha port use 100 for /proc data?

>>>> HZ might even go away in future kernels - what if we start using
>>>> nonperiodic timer interrupts?

HZ still has a value. HZ is the smallest interval you can generate or
measure. HZ could be as high as the CPU core frequency.

>>> Actually, POSIX wants it (sysconf(_SC_CLK_TCK)). glibc currently returns
>>> the HZ value it was compiled with, but that is hardly satisfying.
>>
>> This means that GLiBC must be fixed...
>
> First the kernel needs to export the necessary information. glibc is
> hardly to blame here.

Feel free to grab the code from proc/sysinfo.c in the procps package.

BTW, glibc also does a bad guess for the number of processors. It tries
to parse /proc/cpuinfo, which is of course different on every architecture.
Sometimes glibc screws up, giving a value of zero. According to the standard,
a value of zero is _not_ an error code. Hey, Linux is the fastest OS on a
system without any processors at all!

>>> I agree that the time related sysctls/proc files should be fixed,
>>> but I'm afraid it is too late.
>>
>> /proc files are not a problem but sysctls... In which sysctls HZ is
>> used in such way that GLibC recompilation will not help ?

AFAIK,
sysctl() is for setting values
sysconf() is for getting values

> It has nothing to do with glibc. There are documented sysctls that use
> HZ as unit, and Linux has long ago left the innocent state where you
> can change documented interfaces without thinking and some migration
> strategy.

OK, here is one:

1. add proper sysconf() support
2. make HZ vary by processor config (200 for 486, 400 for 586...)
3. revert to a 100 HZ default for Linux 2.4.xx, then back for 2.5.xx

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/