Regarding USER_HZ and the exposure of kernel jiffies in userspace
From: William Breathitt Gray
Date: Sat Aug 22 2015 - 10:01:17 EST
Hello,
I submitted a bug report a couple months ago regarding the exposure of
unscaled kernel jiffies in the /proc/timer_list file (see
http://bugzilla.kernel.org/show_bug.cgi?id=99401):
> I noticed that the "jiffies" line from the /proc/timer_list file has a
> value that is not scaled via the USER_HZ constant. Looking into the
> source code of the kernel/time/timer_list.c file, I found lines
> 189-190 to be the cause:
>
>     SEQ_printf(m, "jiffies: %Lu\n",
>                (unsigned long long)jiffies);
>
> The actual kernel jiffies are printed out directly without scaling. I
> was under the impression that all kernel jiffies should be scaled via
> USER_HZ -- e.g. through the jiffies_to_clock_t function provided by
> include/linux/jiffies.h -- before exposure in userspace.
There has been no response since, and the behavior is still present in
Linux version 4.1.6, so I suspect my understanding is faulty and the
exposure of unscaled kernel jiffies is in fact intentional behavior.
I would like to understand why this behavior is intentional, and correct
my faulty impression of the design. Here's my understanding so far,
please let me know where I go wrong:
The Linux kernel used to have HZ set at a constant 100 for all
architectures. As additional architecture support was added, the HZ
value became variable: e.g. Linux on one machine could have a HZ
value of 1000 while Linux on another machine could have a HZ value
of 100.
This possibility of a variable HZ value caused existing user code,
which had hardcoded an expectation of HZ set to 100, to break due to
the exposure in userspace of kernel jiffies which may have been based
on a HZ value that was not equal to 100.
To prevent the chaos that would occur from years of existing user
code hardcoding a constant HZ value of 100, a compromise was made:
any exposure of kernel jiffies to userspace should be scaled via a
new USER_HZ value -- thus preventing existing user code from
breaking on machines with a different HZ value, while still allowing
the kernel on those machines to have a HZ value different from the
historic 100 value.
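As a rough illustration of the scaling described above, here is a minimal
userspace sketch in C. The helper name scale_jiffies is hypothetical and is
not a kernel function; the real interface is jiffies_to_clock_t, and this
sketch only mirrors the basic arithmetic, assuming both frequencies are
nonzero and are passed in explicitly.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not kernel code): convert a tick count measured at
 * the kernel frequency `hz` into ticks at the userspace frequency `user_hz`.
 *
 * The whole seconds and the sub-second remainder are scaled separately,
 * which keeps the intermediate products small and reduces the risk of
 * overflow compared with computing jiffies * user_hz directly.
 * The sub-second part is truncated toward zero.
 */
uint64_t scale_jiffies(uint64_t jiffies, uint64_t hz, uint64_t user_hz)
{
    assert(hz != 0);        /* assumption: a zero frequency is invalid */
    assert(user_hz != 0);

    uint64_t whole_seconds = jiffies / hz;  /* complete seconds elapsed */
    uint64_t remainder     = jiffies % hz;  /* leftover ticks, < hz */

    return whole_seconds * user_hz + remainder * user_hz / hz;
}
```

For example, with HZ at 1000 and USER_HZ at 100, scale_jiffies(1500, 1000, 100)
gives 150, which is the value I would have expected userspace to see in place
of the raw 1500.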
I believe the error in my understanding is the assumption that _all_
instances of kernel jiffies exposure in userspace should be scaled; but
it appears that not all instances are. When are kernel jiffies meant to
be scaled via USER_HZ, and when are they not?
Thanks,
William Breathitt Gray