[RFC v4 00/22] adapt clockevents frequencies to mono clock

From: Nicolai Stange
Date: Mon Aug 22 2016 - 19:39:37 EST


The previous version (v3) of this series can be found here:

http://lkml.kernel.org/r/20160713130017.8202-1-nicstange@xxxxxxxxx

First of all, apologies for sending such a huge series.
My intent is to give you a good idea of what would IMO be
necessary to fix the "known issues" I listed in v3.
If you think that this many changes are in no way justified by
the final goal, namely to make the clockevent core aware of NTP
corrections and thus avoid the too short timer interrupts with
NOHZ_FULL, please just drop me a note and I'll shut up.

The "known issues" mentioned in v3 were:

Nicolai Stange <nicstange@xxxxxxxxx> writes:
> - The patchset assumes that a clockevent device's ->mult is changed after
> registration only through calls to clockevents_update_freq().
> For a handful of non-x86 drivers this isn't the case.

Addressed by

[1/22] clocksource: sh_cmt: compute rate before registration again
[2/22] clocksource: sh_tmu: compute rate before registration again
[3/22] clocksource: em_sti: split clock prepare and enable steps
[4/22] clocksource: em_sti: compute rate before registration
[5/22] clocksource: h8300_timer8: don't reset rate in ->set_state_oneshot()


> - ->min_delta_ns and ->max_delta_ns vs ->mult_mono:
> In clockevents_program_event(), we had
> delta = min(delta, (int64_t) dev->max_delta_ns);
> delta = max(delta, (int64_t) dev->min_delta_ns);
> clc = ((unsigned long long) delta * dev->mult) >> dev->shift;
> The dev->mult is replaced with the dynamically adjusted dev->mult_mono
> by this series. That's problematic since as I understand it, especially
> ->max_delta_ns is a hard limit preventing the clockevent devices counter
> to be programmed with values larger than its width allows for.
> If ->mult_mono happens to be only slightly larger than ->mult, the
> comparison of delta against the ->mult based ->max_delta_ns can pass
> although the final clc might actually be larger than allowed.
> I think what we really want to have at this place is a check of clc
> against the already present ->min_delta_ticks and ->max_delta_ticks.
>
> The problem with this approach is that many drivers (~40) initialize
> ->min_delta_ns and ->max_delta_ns (typically with clockevent_delta2ns())
> but not the ->*_delta_ticks members. My suggestion at this point would
> be to convert them.

This subseries

[7/22] many clockevent drivers: set ->min_delta_ticks and ->max_delta_ticks
[8/22] arch/s390/kernel/time: set ->min_delta_ticks and ->max_delta_ticks
[9/22] arch/x86/platform/uv/uv_time: set ->min_delta_ticks and ->max_delta_ticks
[10/22] arch/tile/kernel/time: set ->min_delta_ticks and ->max_delta_ticks
[11/22] clockevents: always initialize ->min_delta_ns and ->max_delta_ns
[12/22] many clockevent drivers: don't set ->min_delta_ns and ->max_delta_ns

converts all drivers to set ->*_delta_ticks rather than ->*_delta_ns.
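
To illustrate the concern quoted above, here is a minimal userspace
sketch in C. The struct cev_sketch and the two helpers are hypothetical
and only mirror the arithmetic quoted from clockevents_program_event();
they are not the kernel's code, and they ignore overflow of the
multiplication for very large deltas.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the clock_event_device members discussed here. */
struct cev_sketch {
	uint32_t mult;            /* ns -> cycles factor at the nominal frequency */
	uint32_t mult_adjusted;   /* the same factor after frequency adjustment */
	uint32_t shift;
	uint64_t min_delta_ns;    /* bounds derived from ->mult */
	uint64_t max_delta_ns;
	uint64_t min_delta_ticks; /* bounds imposed by the hardware counter */
	uint64_t max_delta_ticks;
};

/* Old order: clamp the ns delta, then convert with the adjusted factor. */
static uint64_t clc_ns_bounded(const struct cev_sketch *dev, int64_t delta)
{
	if (delta > (int64_t)dev->max_delta_ns)
		delta = (int64_t)dev->max_delta_ns;
	if (delta < (int64_t)dev->min_delta_ns)
		delta = (int64_t)dev->min_delta_ns;
	/* Nothing checks the result against the counter width here. */
	return ((uint64_t)delta * dev->mult_adjusted) >> dev->shift;
}

/* Suggested order: convert first, then clamp the cycle count itself. */
static uint64_t clc_ticks_bounded(const struct cev_sketch *dev, int64_t delta)
{
	uint64_t clc;

	assert(dev->min_delta_ticks <= dev->max_delta_ticks);
	if (delta < 0)
		delta = 0;
	clc = ((uint64_t)delta * dev->mult_adjusted) >> dev->shift;
	if (clc > dev->max_delta_ticks)
		clc = dev->max_delta_ticks;
	if (clc < dev->min_delta_ticks)
		clc = dev->min_delta_ticks;
	return clc;
}
```

With made-up numbers for a 16 bit counter (shift = 32, mult = 1 << 31,
max_delta_ns = 131070, max_delta_ticks = 0xffff) and a mult_adjusted
that is only 0.1% above mult, a delta of 1 ms passes the ns clamp in
the first helper but still converts to more than 0xffff cycles. The
second helper caps the result at 0xffff.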


> This makes ->max_delta_ns obsolete right away.
> ->min_delta_ns is still needed in order to set the ->next_event in
> clockevents_program_min_delta() though. My claim is that
> ->min_delta_ns can be safely replaced with 0 in
> clockevents_program_min_delta()

This subseries

[13/22] clockevents: check a programmed delta's bounds in terms of cycles
[14/22] clockevents: clockevents_program_event(): turn clc into unsigned long
[15/22] clockevents: clockevents_program_min_delta(): don't set ->next_event
[16/22] clockevents: use ->min_delta_ticks_adjusted to program minimum delta
[17/22] clockevents: min delta increment: calculate min_delta_ns from ticks
[18/22] timer_list: print_tickdevice(): calculate ->*_delta_ns dynamically
[19/22] clockevents: purge ->min_delta_ns and ->max_delta_ns

gets rid of ->*_delta_ns altogether.
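
Once only the tick bounds are stored, a nanosecond bound has to be
derived on demand from the current multiplier. A minimal sketch of such
a conversion follows. The helper name ticks_to_ns() is hypothetical,
and its rounding (down) and overflow handling (saturate) are
assumptions that may differ from what the patches actually do.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert a cycle count to nanoseconds using the usual mult/shift
 * representation, where cycles = (ns * mult) >> shift.
 * Rounds down and saturates to UINT64_MAX if the shift overflows.
 */
static uint64_t ticks_to_ns(uint64_t ticks, uint32_t mult, uint32_t shift)
{
	uint64_t clc;

	assert(mult != 0 && shift < 64);
	clc = ticks << shift;
	if ((clc >> shift) != ticks)
		return UINT64_MAX;	/* ticks << shift did not fit into 64 bits */
	return clc / mult;
}
```

For example, with shift = 32 and mult = 1 << 31, a bound of 0xffff
ticks corresponds to 131070 ns. Evaluating the same bound with a
slightly larger, adjusted multiplier gives a slightly smaller ns value,
which is why a ns bound cached at registration time goes stale once the
multiplier changes.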


The remaining patches, i.e. [20-22/22], correspond to the former
[1-3/3] from v3.


Tested on next-20160816. Applies to next-20160822.


In case you want me to proceed with this, I'd really appreciate
some hints on how to send this in non-RFC mode, i.e. with everybody
CC'd. Should I split this into smaller series and send them one after
another, possibly spread out over several releases?


Changes since v3:
[20/22] ("clockevents: initial support for mono to raw time conversion")
- Following J. Stultz' suggestion, I renamed ->mult_mono to
->mult_adjusted
- J. Stultz pointed out that locking is necessary in the
timekeeping_get_mono_mult() helper. Added.
[21/22] ("clockevents: make setting of ->mult and ->mult_adjusted atomic")
- I got the locking wrong here: updates to the bc device should be
serialized as well. Extend the lock to the bc case.
- Adapt to the ->mult_mono => ->mult_adjusted renaming.
[22/22] ("timekeeping: inform clockevents about freq adjustments")
- Adapt to the ->mult_mono => ->mult_adjusted renaming.

Nicolai Stange (22):
clocksource: sh_cmt: compute rate before registration again
clocksource: sh_tmu: compute rate before registration again
clocksource: em_sti: split clock prepare and enable steps
clocksource: em_sti: compute rate before registration
clocksource: h8300_timer8: don't reset rate in ->set_state_oneshot()
clockevents: make clockevents_config() static
many clockevent drivers: set ->min_delta_ticks and ->max_delta_ticks
arch/s390/kernel/time: set ->min_delta_ticks and ->max_delta_ticks
arch/x86/platform/uv/uv_time: set ->min_delta_ticks and ->max_delta_ticks
arch/tile/kernel/time: set ->min_delta_ticks and ->max_delta_ticks
clockevents: always initialize ->min_delta_ns and ->max_delta_ns
many clockevent drivers: don't set ->min_delta_ns and ->max_delta_ns
clockevents: check a programmed delta's bounds in terms of cycles
clockevents: clockevents_program_event(): turn clc into unsigned long
clockevents: clockevents_program_min_delta(): don't set ->next_event
clockevents: use ->min_delta_ticks_adjusted to program minimum delta
clockevents: min delta increment: calculate min_delta_ns from ticks
timer_list: print_tickdevice(): calculate ->*_delta_ns dynamically
clockevents: purge ->min_delta_ns and ->max_delta_ns
clockevents: initial support for mono to raw time conversion
clockevents: make setting of ->mult and ->mult_adjusted atomic
timekeeping: inform clockevents about freq adjustments

arch/avr32/kernel/time.c | 4 +-
arch/blackfin/kernel/time-ts.c | 8 +-
arch/c6x/platforms/timer64.c | 4 +-
arch/hexagon/kernel/time.c | 4 +-
arch/m68k/coldfire/pit.c | 6 +-
arch/microblaze/kernel/timer.c | 6 +-
arch/mips/alchemy/common/time.c | 4 +-
arch/mips/jz4740/time.c | 4 +-
arch/mips/kernel/cevt-bcm1480.c | 4 +-
arch/mips/kernel/cevt-ds1287.c | 4 +-
arch/mips/kernel/cevt-gt641xx.c | 4 +-
arch/mips/kernel/cevt-sb1250.c | 4 +-
arch/mips/kernel/cevt-txx9.c | 5 +-
arch/mips/loongson32/common/time.c | 4 +-
arch/mips/loongson64/common/cs5536/cs5536_mfgpt.c | 4 +-
arch/mips/loongson64/loongson-3/hpet.c | 4 +-
arch/mips/ralink/cevt-rt3352.c | 4 +-
arch/mips/sgi-ip27/ip27-timer.c | 4 +-
arch/mn10300/kernel/cevt-mn10300.c | 4 +-
arch/powerpc/kernel/time.c | 6 +-
arch/s390/kernel/time.c | 4 +-
arch/score/kernel/time.c | 6 +-
arch/sparc/kernel/time_32.c | 4 +-
arch/sparc/kernel/time_64.c | 6 +-
arch/tile/kernel/time.c | 4 +-
arch/um/kernel/time.c | 4 +-
arch/unicore32/kernel/time.c | 6 +-
arch/x86/kernel/apic/apic.c | 12 +-
arch/x86/lguest/boot.c | 4 +-
arch/x86/platform/uv/uv_time.c | 6 +-
arch/x86/xen/time.c | 8 +-
drivers/clocksource/dw_apb_timer.c | 5 +-
drivers/clocksource/em_sti.c | 49 ++++---
drivers/clocksource/h8300_timer8.c | 8 --
drivers/clocksource/metag_generic.c | 4 +-
drivers/clocksource/numachip.c | 4 +-
drivers/clocksource/sh_cmt.c | 50 +++----
drivers/clocksource/sh_tmu.c | 26 ++--
drivers/clocksource/timer-atlas7.c | 4 +-
include/linux/clockchips.h | 17 ++-
kernel/time/clockevents.c | 153 ++++++++++++++++------
kernel/time/tick-broadcast-hrtimer.c | 2 -
kernel/time/tick-internal.h | 2 +
kernel/time/timekeeping.c | 17 +++
kernel/time/timer_list.c | 10 +-
45 files changed, 295 insertions(+), 211 deletions(-)

--
2.9.2