[GIT PULL] timer changes for v4.1

From: Ingo Molnar
Date: Mon Apr 13 2015 - 03:30:43 EST


Linus,

Please pull the latest timers-core-for-linus git tree from:

git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git timers-core-for-linus

# HEAD: def747087e83aa5f6a71582cfa71e18341988688 timers/PM: Drop unnecessary braces from tick_freeze()

The main changes in this cycle were:

- clockevents state machine cleanups and enhancements (Viresh Kumar)

- clockevents broadcast notifier horror to state machine conversion
and related cleanups. (Thomas Gleixner, Rafael J. Wysocki)

- clocksource and timekeeping core updates. (John Stultz)

- clocksource driver updates and fixes. (Ben Dooks, Dmitry Osipenko,
Hans de Goede, Laurent Pinchart, Maxime Ripard, Xunlei Pang)

- y2038 fixes. (Xunlei Pang, John Stultz)

- NMI-safe ktime_get_raw_fast() and general refactoring of the clock
code, in preparation to perf's per event clock ID support. (Peter
Zijlstra)

- generic sched/clock fixes, optimizations and cleanups (Daniel Thompson)

- clockevents cpu_down() race fix (Preeti U Murthy)

Thanks,

Ingo

------------------>
Ben Dooks (2):
clocksource/drivers/dw_apb_timers_of: Fix IO endianness causing time jumps
clocksource/drivers/at91: Fix IO endianness

Daniel Thompson (5):
timers, sched/clock: Match scope of read and write seqcounts
timers, sched/clock: Optimize cache line usage
timers, sched/clock: Remove suspend from clock_read_data()
timers, sched/clock: Remove redundant notrace from update function
timers, sched/clock: Avoid deadlock during read from NMI

Dmitry Osipenko (1):
clocksource/drivers/tegra: Fix IO endianness

Hans de Goede (1):
clocksource/drivers/sun4i-timer: Only register a sched_clock on sun4i and sun5i

Ingo Molnar (3):
timers, sched/clock: Clean up the code a bit
tick: Further simplify tick-internal.h
clockevents: Clean up clockchips.h

John Stultz (13):
clocksource: Simplify the clocks_calc_max_nsecs() logic
clocksource: Simplify the logic around clocksource wrapping safety margins
clocksource: Add 'max_cycles' to 'struct clocksource'
timekeeping: Add debugging checks to warn if we see delays
timekeeping: Add checks to cap clocksource reads to the 'max_cycles' value
timekeeping: Try to catch clocksource delta underflows
timekeeping: Add warnings when overflows or underflows are observed
clocksource: Improve clocksource watchdog reporting
clocksource: Mostly kill clocksource_register()
clocksource, sparc32: Convert to using clocksource_register_hz()
clocksource: Add some debug info about clocksources being registered
clocksource: Rename __clocksource_updatefreq_*() to __clocksource_update_freq_*()
clocksource: Improve comment explaining clocks_calc_max_nsecs()'s 50% safety margin

Laurent Pinchart (1):
clocksource/drivers/arm_arch_timer: Rename 'arch_timer_probed()' to 'arch_timer_needs_probing()' to reflect behaviour

Maxime Ripard (4):
clocksource/drivers/sun5i: Switch to request_irq()
clocksource/drivers/sun5i: Use of_io_request_and_map()
clocksource/drivers/sun5i: Refactor the current code
clocksource/drivers/sun5i: Add clock notifiers

Peter Zijlstra (7):
time: Rename timekeeper::tkr to timekeeper::tkr_mono
time: Add timerkeeper::tkr_raw
time: Parametrize all tk_fast_mono users
time: Introduce tk_fast_raw
time: Add ktime_get_tai_ns()
timer: Allocate per-cpu tvec_base's statically
timer: Further simplify the SMP and HOTPLUG logic

Preeti U Murthy (1):
clockevents: Fix cpu_down() race for hrtimer based broadcasting

Rafael J. Wysocki (3):
clockevents: Remove broadcast oneshot control leftovers
timers/PM: Fix up tick_unfreeze()
timers/PM: Drop unnecessary braces from tick_freeze()

Thomas Gleixner (29):
clockevents: Remove CONFIG_GENERIC_CLOCKEVENTS_BUILD
tick: Move clocksource related stuff to timekeeping.h
tick: Simplify tick-internal.h
tick: Move core only declarations and functions to core
clockevents: Remove extra local_irq_save() in clockevents_exchange_device()
clockevents: Make suspend/resume calls explicit
tick: Make tick_resume_broadcast_oneshot() static
tick/xen: Provide and use tick_suspend_local() and tick_resume_local()
arm/bL_switcher: Kill tick suspend hackery
ACPI/PAD: Remove the local APIC nonsense
clockevents: Provide explicit broadcast control functions
x86/amd/idle, clockevents: Use explicit broadcast control function
ACPI/PAD: Use explicit broadcast control function
ACPI/processor: Use explicit broadcast control function
cpuidle: Use explicit broadcast control function
intel_idle: Use explicit broadcast control function
ARM: OMAP: Use explicit broadcast control function
clockevents: Remove the broadcast control leftovers
clockevents: Provide explicit broadcast oneshot control functions
x86/amd/idle, clockevents: Use explicit broadcast oneshot control functions
ACPI/PAD: Use explicit broadcast oneshot control function
ACPI/idle: Use explicit broadcast control function
intel_idle: Use explicit broadcast oneshot control function
ARM: OMAP: Use explicit broadcast oneshot control function
ARM: Tegra: Use explicit broadcast oneshot control function
sched/idle: Use explicit broadcast oneshot control function
clockevents: Make tick handover explicit
clockevents: Cleanup dead cpu explicitely
timekeeping: Get rid of stale comment

Viresh Kumar (6):
clockevents: Introduce mode specific callbacks
clockevents: Handle tick device's resume separately
clockevents: Manage device's state separately for the core
clockevents: Don't validate dev->mode against CLOCK_EVT_MODE_UNUSED for new interface
clocksource/drivers/efm32: Use CLOCK_EVT_FEAT_PERIODIC for defining features
timer: Don't initialize 'tvec_base' on hotplug

Xunlei Pang (18):
time: Add y2038 safe read_boot_clock64()
time: Add y2038 safe read_persistent_clock64()
time: Add y2038 safe update_persistent_clock64()
ARM: OMAP: 32k counter: Provide y2038-safe omap_read_persistent_clock() replacement
clocksource/drivers/tegra: Provide y2038-safe tegra_read_persistent_clock() replacement
ARM, clocksource/drivers: Provide read_boot_clock64() and read_persistent_clock64() and use them
drivers/rtc: Provide y2038 safe rtc_class_ops.set_mmss() replacement
drivers/rtc/test: Update driver to address y2038/y2106 issues
drivers/rtc/ab3100: Update driver to address y2038/y2106 issues
drivers/rtc/mc13xxx: Update driver to address y2038/y2106 issues
drivers/rtc/mxc: Modify rtc_update_alarm() not to touch the alarm time
drivers/rtc/mxc: Convert get_alarm_or_time()/set_alarm_or_time() to use time64_t
drivers/rtc/mxc: Update driver to address y2038/y2106 issues
alpha, rtc: Change to use rtc_class_ops's set_mmss64()
time: Don't build timekeeping_inject_sleeptime64() if no one uses it
drivers/rtc: Remove redundant rtc_valid_tm() from rtc_resume()
time: Fix a bug in timekeeping_suspend() with no persistent clock
time, drivers/rtc: Don't bother with rtc_resume() for the nonstop clocksource


arch/alpha/kernel/rtc.c | 8 +-
arch/arm/common/bL_switcher.c | 16 +-
arch/arm/include/asm/mach/time.h | 3 +-
arch/arm/kernel/time.c | 6 +-
arch/arm/mach-omap2/cpuidle44xx.c | 10 +-
arch/arm/mach-tegra/cpuidle-tegra114.c | 6 +-
arch/arm/mach-tegra/cpuidle-tegra20.c | 10 +-
arch/arm/mach-tegra/cpuidle-tegra30.c | 10 +-
arch/arm/plat-omap/counter_32k.c | 20 +-
arch/arm64/kernel/vdso.c | 10 +-
arch/mips/lasat/sysctl.c | 4 +-
arch/s390/kernel/time.c | 20 +-
arch/sparc/kernel/time_32.c | 6 +-
arch/tile/kernel/time.c | 24 +-
arch/x86/kernel/process.c | 13 +-
arch/x86/kernel/vsyscall_gtod.c | 24 +-
arch/x86/kvm/x86.c | 14 +-
arch/x86/xen/suspend.c | 11 +-
drivers/acpi/acpi_pad.c | 29 +-
drivers/acpi/processor_idle.c | 20 +-
drivers/clocksource/arm_arch_timer.c | 12 +-
drivers/clocksource/dw_apb_timer_of.c | 2 +-
drivers/clocksource/em_sti.c | 2 +-
drivers/clocksource/sh_cmt.c | 2 +-
drivers/clocksource/sh_tmu.c | 2 +-
drivers/clocksource/sun4i_timer.c | 10 +-
drivers/clocksource/tegra20_timer.c | 19 +-
drivers/clocksource/time-efm32.c | 2 +-
drivers/clocksource/timer-atmel-pit.c | 4 +-
drivers/clocksource/timer-sun5i.c | 299 +++++++++++++++-----
drivers/cpuidle/driver.c | 23 +-
drivers/idle/intel_idle.c | 17 +-
drivers/rtc/class.c | 8 +-
drivers/rtc/interface.c | 8 +-
drivers/rtc/rtc-ab3100.c | 55 ++--
drivers/rtc/rtc-mc13xxx.c | 32 +--
drivers/rtc/rtc-mxc.c | 55 ++--
drivers/rtc/rtc-test.c | 19 +-
drivers/rtc/systohc.c | 7 +-
include/linux/clockchips.h | 154 ++++++-----
include/linux/clocksource.h | 25 +-
include/linux/rtc.h | 1 +
include/linux/tick.h | 190 +++++--------
include/linux/timekeeper_internal.h | 16 +-
include/linux/timekeeping.h | 18 +-
kernel/cpu.c | 5 +
kernel/sched/idle.c | 5 +-
kernel/time/Kconfig | 6 -
kernel/time/Makefile | 6 +-
kernel/time/clockevents.c | 229 +++++++++------
kernel/time/clocksource.c | 173 ++++++------
kernel/time/hrtimer.c | 9 +-
kernel/time/jiffies.c | 7 +-
kernel/time/ntp.c | 14 +-
kernel/time/sched_clock.c | 236 +++++++++++-----
kernel/time/tick-broadcast.c | 179 ++++++------
kernel/time/tick-common.c | 82 ++++--
kernel/time/tick-internal.h | 211 ++++++--------
kernel/time/tick-oneshot.c | 6 +-
kernel/time/tick-sched.c | 7 +-
kernel/time/tick-sched.h | 74 +++++
kernel/time/timekeeping.c | 490 ++++++++++++++++++++++-----------
kernel/time/timekeeping.h | 7 +
kernel/time/timer.c | 149 +++++-----
kernel/time/timer_list.c | 34 ++-
lib/Kconfig.debug | 13 +
66 files changed, 1849 insertions(+), 1339 deletions(-)
create mode 100644 kernel/time/tick-sched.h

diff --git a/arch/alpha/kernel/rtc.c b/arch/alpha/kernel/rtc.c
index c8d284d8521f..f535a3fd0f60 100644
--- a/arch/alpha/kernel/rtc.c
+++ b/arch/alpha/kernel/rtc.c
@@ -116,7 +116,7 @@ alpha_rtc_set_time(struct device *dev, struct rtc_time *tm)
}

static int
-alpha_rtc_set_mmss(struct device *dev, unsigned long nowtime)
+alpha_rtc_set_mmss(struct device *dev, time64_t nowtime)
{
int retval = 0;
int real_seconds, real_minutes, cmos_minutes;
@@ -211,7 +211,7 @@ alpha_rtc_ioctl(struct device *dev, unsigned int cmd, unsigned long arg)
static const struct rtc_class_ops alpha_rtc_ops = {
.read_time = alpha_rtc_read_time,
.set_time = alpha_rtc_set_time,
- .set_mmss = alpha_rtc_set_mmss,
+ .set_mmss64 = alpha_rtc_set_mmss,
.ioctl = alpha_rtc_ioctl,
};

@@ -276,7 +276,7 @@ do_remote_mmss(void *data)
}

static int
-remote_set_mmss(struct device *dev, unsigned long now)
+remote_set_mmss(struct device *dev, time64_t now)
{
union remote_data x;
if (smp_processor_id() != boot_cpuid) {
@@ -290,7 +290,7 @@ remote_set_mmss(struct device *dev, unsigned long now)
static const struct rtc_class_ops remote_rtc_ops = {
.read_time = remote_read_time,
.set_time = remote_set_time,
- .set_mmss = remote_set_mmss,
+ .set_mmss64 = remote_set_mmss,
.ioctl = alpha_rtc_ioctl,
};
#endif
diff --git a/arch/arm/common/bL_switcher.c b/arch/arm/common/bL_switcher.c
index 6eaddc47c43d..37dc0fe1093f 100644
--- a/arch/arm/common/bL_switcher.c
+++ b/arch/arm/common/bL_switcher.c
@@ -151,8 +151,6 @@ static int bL_switch_to(unsigned int new_cluster_id)
unsigned int mpidr, this_cpu, that_cpu;
unsigned int ob_mpidr, ob_cpu, ob_cluster, ib_mpidr, ib_cpu, ib_cluster;
struct completion inbound_alive;
- struct tick_device *tdev;
- enum clock_event_mode tdev_mode;
long volatile *handshake_ptr;
int ipi_nr, ret;

@@ -219,13 +217,7 @@ static int bL_switch_to(unsigned int new_cluster_id)
/* redirect GIC's SGIs to our counterpart */
gic_migrate_target(bL_gic_id[ib_cpu][ib_cluster]);

- tdev = tick_get_device(this_cpu);
- if (tdev && !cpumask_equal(tdev->evtdev->cpumask, cpumask_of(this_cpu)))
- tdev = NULL;
- if (tdev) {
- tdev_mode = tdev->evtdev->mode;
- clockevents_set_mode(tdev->evtdev, CLOCK_EVT_MODE_SHUTDOWN);
- }
+ tick_suspend_local();

ret = cpu_pm_enter();

@@ -251,11 +243,7 @@ static int bL_switch_to(unsigned int new_cluster_id)

ret = cpu_pm_exit();

- if (tdev) {
- clockevents_set_mode(tdev->evtdev, tdev_mode);
- clockevents_program_event(tdev->evtdev,
- tdev->evtdev->next_event, 1);
- }
+ tick_resume_local();

trace_cpu_migrate_finish(ktime_get_real_ns(), ib_mpidr);
local_fiq_enable();
diff --git a/arch/arm/include/asm/mach/time.h b/arch/arm/include/asm/mach/time.h
index 90c12e1e695c..0f79e4dec7f9 100644
--- a/arch/arm/include/asm/mach/time.h
+++ b/arch/arm/include/asm/mach/time.h
@@ -12,8 +12,7 @@

extern void timer_tick(void);

-struct timespec;
-typedef void (*clock_access_fn)(struct timespec *);
+typedef void (*clock_access_fn)(struct timespec64 *);
extern int register_persistent_clock(clock_access_fn read_boot,
clock_access_fn read_persistent);

diff --git a/arch/arm/kernel/time.c b/arch/arm/kernel/time.c
index 0cc7e58c47cc..a66e37e211a9 100644
--- a/arch/arm/kernel/time.c
+++ b/arch/arm/kernel/time.c
@@ -76,7 +76,7 @@ void timer_tick(void)
}
#endif

-static void dummy_clock_access(struct timespec *ts)
+static void dummy_clock_access(struct timespec64 *ts)
{
ts->tv_sec = 0;
ts->tv_nsec = 0;
@@ -85,12 +85,12 @@ static void dummy_clock_access(struct timespec *ts)
static clock_access_fn __read_persistent_clock = dummy_clock_access;
static clock_access_fn __read_boot_clock = dummy_clock_access;;

-void read_persistent_clock(struct timespec *ts)
+void read_persistent_clock64(struct timespec64 *ts)
{
__read_persistent_clock(ts);
}

-void read_boot_clock(struct timespec *ts)
+void read_boot_clock64(struct timespec64 *ts)
{
__read_boot_clock(ts);
}
diff --git a/arch/arm/mach-omap2/cpuidle44xx.c b/arch/arm/mach-omap2/cpuidle44xx.c
index 01e398a868bc..57d429830e09 100644
--- a/arch/arm/mach-omap2/cpuidle44xx.c
+++ b/arch/arm/mach-omap2/cpuidle44xx.c
@@ -14,7 +14,7 @@
#include <linux/cpuidle.h>
#include <linux/cpu_pm.h>
#include <linux/export.h>
-#include <linux/clockchips.h>
+#include <linux/tick.h>

#include <asm/cpuidle.h>
#include <asm/proc-fns.h>
@@ -84,7 +84,6 @@ static int omap_enter_idle_coupled(struct cpuidle_device *dev,
{
struct idle_statedata *cx = state_ptr + index;
u32 mpuss_can_lose_context = 0;
- int cpu_id = smp_processor_id();

/*
* CPU0 has to wait and stay ON until CPU1 is OFF state.
@@ -112,7 +111,7 @@ static int omap_enter_idle_coupled(struct cpuidle_device *dev,
mpuss_can_lose_context = (cx->mpu_state == PWRDM_POWER_RET) &&
(cx->mpu_logic_state == PWRDM_POWER_OFF);

- clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER, &cpu_id);
+ tick_broadcast_enter();

/*
* Call idle CPU PM enter notifier chain so that
@@ -169,7 +168,7 @@ static int omap_enter_idle_coupled(struct cpuidle_device *dev,
if (dev->cpu == 0 && mpuss_can_lose_context)
cpu_cluster_pm_exit();

- clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_EXIT, &cpu_id);
+ tick_broadcast_exit();

fail:
cpuidle_coupled_parallel_barrier(dev, &abort_barrier);
@@ -184,8 +183,7 @@ static int omap_enter_idle_coupled(struct cpuidle_device *dev,
*/
static void omap_setup_broadcast_timer(void *arg)
{
- int cpu = smp_processor_id();
- clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ON, &cpu);
+ tick_broadcast_enable();
}

static struct cpuidle_driver omap4_idle_driver = {
diff --git a/arch/arm/mach-tegra/cpuidle-tegra114.c b/arch/arm/mach-tegra/cpuidle-tegra114.c
index f2b586d7b15d..155807fa6fdd 100644
--- a/arch/arm/mach-tegra/cpuidle-tegra114.c
+++ b/arch/arm/mach-tegra/cpuidle-tegra114.c
@@ -15,7 +15,7 @@
*/

#include <asm/firmware.h>
-#include <linux/clockchips.h>
+#include <linux/tick.h>
#include <linux/cpuidle.h>
#include <linux/cpu_pm.h>
#include <linux/kernel.h>
@@ -44,7 +44,7 @@ static int tegra114_idle_power_down(struct cpuidle_device *dev,
tegra_set_cpu_in_lp2();
cpu_pm_enter();

- clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER, &dev->cpu);
+ tick_broadcast_enter();

call_firmware_op(prepare_idle);

@@ -52,7 +52,7 @@ static int tegra114_idle_power_down(struct cpuidle_device *dev,
if (call_firmware_op(do_idle, 0) == -ENOSYS)
cpu_suspend(0, tegra30_sleep_cpu_secondary_finish);

- clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_EXIT, &dev->cpu);
+ tick_broadcast_exit();

cpu_pm_exit();
tegra_clear_cpu_in_lp2();
diff --git a/arch/arm/mach-tegra/cpuidle-tegra20.c b/arch/arm/mach-tegra/cpuidle-tegra20.c
index 4f25a7c7ca0f..48844ae6c3a1 100644
--- a/arch/arm/mach-tegra/cpuidle-tegra20.c
+++ b/arch/arm/mach-tegra/cpuidle-tegra20.c
@@ -20,7 +20,7 @@
*/

#include <linux/clk/tegra.h>
-#include <linux/clockchips.h>
+#include <linux/tick.h>
#include <linux/cpuidle.h>
#include <linux/cpu_pm.h>
#include <linux/kernel.h>
@@ -136,11 +136,11 @@ static bool tegra20_cpu_cluster_power_down(struct cpuidle_device *dev,
if (tegra20_reset_cpu_1() || !tegra_cpu_rail_off_ready())
return false;

- clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER, &dev->cpu);
+ tick_broadcast_enter();

tegra_idle_lp2_last();

- clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_EXIT, &dev->cpu);
+ tick_broadcast_exit();

if (cpu_online(1))
tegra20_wake_cpu1_from_reset();
@@ -153,13 +153,13 @@ static bool tegra20_idle_enter_lp2_cpu_1(struct cpuidle_device *dev,
struct cpuidle_driver *drv,
int index)
{
- clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER, &dev->cpu);
+ tick_broadcast_enter();

cpu_suspend(0, tegra20_sleep_cpu_secondary_finish);

tegra20_cpu_clear_resettable();

- clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_EXIT, &dev->cpu);
+ tick_broadcast_exit();

return true;
}
diff --git a/arch/arm/mach-tegra/cpuidle-tegra30.c b/arch/arm/mach-tegra/cpuidle-tegra30.c
index f8815ed65d9d..84d809a3cba3 100644
--- a/arch/arm/mach-tegra/cpuidle-tegra30.c
+++ b/arch/arm/mach-tegra/cpuidle-tegra30.c
@@ -20,7 +20,7 @@
*/

#include <linux/clk/tegra.h>
-#include <linux/clockchips.h>
+#include <linux/tick.h>
#include <linux/cpuidle.h>
#include <linux/cpu_pm.h>
#include <linux/kernel.h>
@@ -76,11 +76,11 @@ static bool tegra30_cpu_cluster_power_down(struct cpuidle_device *dev,
return false;
}

- clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER, &dev->cpu);
+ tick_broadcast_enter();

tegra_idle_lp2_last();

- clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_EXIT, &dev->cpu);
+ tick_broadcast_exit();

return true;
}
@@ -90,13 +90,13 @@ static bool tegra30_cpu_core_power_down(struct cpuidle_device *dev,
struct cpuidle_driver *drv,
int index)
{
- clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER, &dev->cpu);
+ tick_broadcast_enter();

smp_wmb();

cpu_suspend(0, tegra30_sleep_cpu_secondary_finish);

- clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_EXIT, &dev->cpu);
+ tick_broadcast_exit();

return true;
}
diff --git a/arch/arm/plat-omap/counter_32k.c b/arch/arm/plat-omap/counter_32k.c
index 61b4d705c267..2438b96004c1 100644
--- a/arch/arm/plat-omap/counter_32k.c
+++ b/arch/arm/plat-omap/counter_32k.c
@@ -44,24 +44,20 @@ static u64 notrace omap_32k_read_sched_clock(void)
}

/**
- * omap_read_persistent_clock - Return time from a persistent clock.
+ * omap_read_persistent_clock64 - Return time from a persistent clock.
*
* Reads the time from a source which isn't disabled during PM, the
* 32k sync timer. Convert the cycles elapsed since last read into
- * nsecs and adds to a monotonically increasing timespec.
+ * nsecs and adds to a monotonically increasing timespec64.
*/
-static struct timespec persistent_ts;
+static struct timespec64 persistent_ts;
static cycles_t cycles;
static unsigned int persistent_mult, persistent_shift;
-static DEFINE_SPINLOCK(read_persistent_clock_lock);

-static void omap_read_persistent_clock(struct timespec *ts)
+static void omap_read_persistent_clock64(struct timespec64 *ts)
{
unsigned long long nsecs;
cycles_t last_cycles;
- unsigned long flags;
-
- spin_lock_irqsave(&read_persistent_clock_lock, flags);

last_cycles = cycles;
cycles = sync32k_cnt_reg ? readl_relaxed(sync32k_cnt_reg) : 0;
@@ -69,11 +65,9 @@ static void omap_read_persistent_clock(struct timespec *ts)
nsecs = clocksource_cyc2ns(cycles - last_cycles,
persistent_mult, persistent_shift);

- timespec_add_ns(&persistent_ts, nsecs);
+ timespec64_add_ns(&persistent_ts, nsecs);

*ts = persistent_ts;
-
- spin_unlock_irqrestore(&read_persistent_clock_lock, flags);
}

/**
@@ -103,7 +97,7 @@ int __init omap_init_clocksource_32k(void __iomem *vbase)

/*
* 120000 rough estimate from the calculations in
- * __clocksource_updatefreq_scale.
+ * __clocksource_update_freq_scale.
*/
clocks_calc_mult_shift(&persistent_mult, &persistent_shift,
32768, NSEC_PER_SEC, 120000);
@@ -116,7 +110,7 @@ int __init omap_init_clocksource_32k(void __iomem *vbase)
}

sched_clock_register(omap_32k_read_sched_clock, 32, 32768);
- register_persistent_clock(NULL, omap_read_persistent_clock);
+ register_persistent_clock(NULL, omap_read_persistent_clock64);
pr_info("OMAP clocksource: 32k_counter at 32768 Hz\n");

return 0;
diff --git a/arch/arm64/kernel/vdso.c b/arch/arm64/kernel/vdso.c
index 32aeea083d93..ec37ab3f524f 100644
--- a/arch/arm64/kernel/vdso.c
+++ b/arch/arm64/kernel/vdso.c
@@ -200,7 +200,7 @@ int arch_setup_additional_pages(struct linux_binprm *bprm,
void update_vsyscall(struct timekeeper *tk)
{
struct timespec xtime_coarse;
- u32 use_syscall = strcmp(tk->tkr.clock->name, "arch_sys_counter");
+ u32 use_syscall = strcmp(tk->tkr_mono.clock->name, "arch_sys_counter");

++vdso_data->tb_seq_count;
smp_wmb();
@@ -213,11 +213,11 @@ void update_vsyscall(struct timekeeper *tk)
vdso_data->wtm_clock_nsec = tk->wall_to_monotonic.tv_nsec;

if (!use_syscall) {
- vdso_data->cs_cycle_last = tk->tkr.cycle_last;
+ vdso_data->cs_cycle_last = tk->tkr_mono.cycle_last;
vdso_data->xtime_clock_sec = tk->xtime_sec;
- vdso_data->xtime_clock_nsec = tk->tkr.xtime_nsec;
- vdso_data->cs_mult = tk->tkr.mult;
- vdso_data->cs_shift = tk->tkr.shift;
+ vdso_data->xtime_clock_nsec = tk->tkr_mono.xtime_nsec;
+ vdso_data->cs_mult = tk->tkr_mono.mult;
+ vdso_data->cs_shift = tk->tkr_mono.shift;
}

smp_wmb();
diff --git a/arch/mips/lasat/sysctl.c b/arch/mips/lasat/sysctl.c
index 3b7f65cc4218..cf9b4633257e 100644
--- a/arch/mips/lasat/sysctl.c
+++ b/arch/mips/lasat/sysctl.c
@@ -75,11 +75,11 @@ static int rtctmp;
int proc_dolasatrtc(struct ctl_table *table, int write,
void *buffer, size_t *lenp, loff_t *ppos)
{
- struct timespec ts;
+ struct timespec64 ts;
int r;

if (!write) {
- read_persistent_clock(&ts);
+ read_persistent_clock64(&ts);
rtctmp = ts.tv_sec;
/* check for time < 0 and set to 0 */
if (rtctmp < 0)
diff --git a/arch/s390/kernel/time.c b/arch/s390/kernel/time.c
index 20660dddb2d6..170ddd2018b3 100644
--- a/arch/s390/kernel/time.c
+++ b/arch/s390/kernel/time.c
@@ -215,20 +215,20 @@ void update_vsyscall(struct timekeeper *tk)
{
u64 nsecps;

- if (tk->tkr.clock != &clocksource_tod)
+ if (tk->tkr_mono.clock != &clocksource_tod)
return;

/* Make userspace gettimeofday spin until we're done. */
++vdso_data->tb_update_count;
smp_wmb();
- vdso_data->xtime_tod_stamp = tk->tkr.cycle_last;
+ vdso_data->xtime_tod_stamp = tk->tkr_mono.cycle_last;
vdso_data->xtime_clock_sec = tk->xtime_sec;
- vdso_data->xtime_clock_nsec = tk->tkr.xtime_nsec;
+ vdso_data->xtime_clock_nsec = tk->tkr_mono.xtime_nsec;
vdso_data->wtom_clock_sec =
tk->xtime_sec + tk->wall_to_monotonic.tv_sec;
- vdso_data->wtom_clock_nsec = tk->tkr.xtime_nsec +
- + ((u64) tk->wall_to_monotonic.tv_nsec << tk->tkr.shift);
- nsecps = (u64) NSEC_PER_SEC << tk->tkr.shift;
+ vdso_data->wtom_clock_nsec = tk->tkr_mono.xtime_nsec +
+ + ((u64) tk->wall_to_monotonic.tv_nsec << tk->tkr_mono.shift);
+ nsecps = (u64) NSEC_PER_SEC << tk->tkr_mono.shift;
while (vdso_data->wtom_clock_nsec >= nsecps) {
vdso_data->wtom_clock_nsec -= nsecps;
vdso_data->wtom_clock_sec++;
@@ -236,7 +236,7 @@ void update_vsyscall(struct timekeeper *tk)

vdso_data->xtime_coarse_sec = tk->xtime_sec;
vdso_data->xtime_coarse_nsec =
- (long)(tk->tkr.xtime_nsec >> tk->tkr.shift);
+ (long)(tk->tkr_mono.xtime_nsec >> tk->tkr_mono.shift);
vdso_data->wtom_coarse_sec =
vdso_data->xtime_coarse_sec + tk->wall_to_monotonic.tv_sec;
vdso_data->wtom_coarse_nsec =
@@ -246,8 +246,8 @@ void update_vsyscall(struct timekeeper *tk)
vdso_data->wtom_coarse_sec++;
}

- vdso_data->tk_mult = tk->tkr.mult;
- vdso_data->tk_shift = tk->tkr.shift;
+ vdso_data->tk_mult = tk->tkr_mono.mult;
+ vdso_data->tk_shift = tk->tkr_mono.shift;
smp_wmb();
++vdso_data->tb_update_count;
}
@@ -283,7 +283,7 @@ void __init time_init(void)
if (register_external_irq(EXT_IRQ_TIMING_ALERT, timing_alert_interrupt))
panic("Couldn't request external interrupt 0x1406");

- if (clocksource_register(&clocksource_tod) != 0)
+ if (__clocksource_register(&clocksource_tod) != 0)
panic("Could not register TOD clock source");

/* Enable TOD clock interrupts on the boot cpu. */
diff --git a/arch/sparc/kernel/time_32.c b/arch/sparc/kernel/time_32.c
index 2f80d23a0a44..18147a5523d9 100644
--- a/arch/sparc/kernel/time_32.c
+++ b/arch/sparc/kernel/time_32.c
@@ -181,17 +181,13 @@ static struct clocksource timer_cs = {
.rating = 100,
.read = timer_cs_read,
.mask = CLOCKSOURCE_MASK(64),
- .shift = 2,
.flags = CLOCK_SOURCE_IS_CONTINUOUS,
};

static __init int setup_timer_cs(void)
{
timer_cs_enabled = 1;
- timer_cs.mult = clocksource_hz2mult(sparc_config.clock_rate,
- timer_cs.shift);
-
- return clocksource_register(&timer_cs);
+ return clocksource_register_hz(&timer_cs, sparc_config.clock_rate);
}

#ifdef CONFIG_SMP
diff --git a/arch/tile/kernel/time.c b/arch/tile/kernel/time.c
index d412b0856c0a..00178ecf9aea 100644
--- a/arch/tile/kernel/time.c
+++ b/arch/tile/kernel/time.c
@@ -257,34 +257,34 @@ void update_vsyscall_tz(void)

void update_vsyscall(struct timekeeper *tk)
{
- if (tk->tkr.clock != &cycle_counter_cs)
+ if (tk->tkr_mono.clock != &cycle_counter_cs)
return;

write_seqcount_begin(&vdso_data->tb_seq);

- vdso_data->cycle_last = tk->tkr.cycle_last;
- vdso_data->mask = tk->tkr.mask;
- vdso_data->mult = tk->tkr.mult;
- vdso_data->shift = tk->tkr.shift;
+ vdso_data->cycle_last = tk->tkr_mono.cycle_last;
+ vdso_data->mask = tk->tkr_mono.mask;
+ vdso_data->mult = tk->tkr_mono.mult;
+ vdso_data->shift = tk->tkr_mono.shift;

vdso_data->wall_time_sec = tk->xtime_sec;
- vdso_data->wall_time_snsec = tk->tkr.xtime_nsec;
+ vdso_data->wall_time_snsec = tk->tkr_mono.xtime_nsec;

vdso_data->monotonic_time_sec = tk->xtime_sec
+ tk->wall_to_monotonic.tv_sec;
- vdso_data->monotonic_time_snsec = tk->tkr.xtime_nsec
+ vdso_data->monotonic_time_snsec = tk->tkr_mono.xtime_nsec
+ ((u64)tk->wall_to_monotonic.tv_nsec
- << tk->tkr.shift);
+ << tk->tkr_mono.shift);
while (vdso_data->monotonic_time_snsec >=
- (((u64)NSEC_PER_SEC) << tk->tkr.shift)) {
+ (((u64)NSEC_PER_SEC) << tk->tkr_mono.shift)) {
vdso_data->monotonic_time_snsec -=
- ((u64)NSEC_PER_SEC) << tk->tkr.shift;
+ ((u64)NSEC_PER_SEC) << tk->tkr_mono.shift;
vdso_data->monotonic_time_sec++;
}

vdso_data->wall_time_coarse_sec = tk->xtime_sec;
- vdso_data->wall_time_coarse_nsec = (long)(tk->tkr.xtime_nsec >>
- tk->tkr.shift);
+ vdso_data->wall_time_coarse_nsec = (long)(tk->tkr_mono.xtime_nsec >>
+ tk->tkr_mono.shift);

vdso_data->monotonic_time_coarse_sec =
vdso_data->wall_time_coarse_sec + tk->wall_to_monotonic.tv_sec;
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 046e2d620bbe..2f43c62fcd6d 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -9,7 +9,7 @@
#include <linux/sched.h>
#include <linux/module.h>
#include <linux/pm.h>
-#include <linux/clockchips.h>
+#include <linux/tick.h>
#include <linux/random.h>
#include <linux/user-return-notifier.h>
#include <linux/dmi.h>
@@ -377,14 +377,11 @@ static void amd_e400_idle(void)

if (!cpumask_test_cpu(cpu, amd_e400_c1e_mask)) {
cpumask_set_cpu(cpu, amd_e400_c1e_mask);
- /*
- * Force broadcast so ACPI can not interfere.
- */
- clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_FORCE,
- &cpu);
+ /* Force broadcast so ACPI can not interfere. */
+ tick_broadcast_force();
pr_info("Switch to broadcast mode on CPU%d\n", cpu);
}
- clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER, &cpu);
+ tick_broadcast_enter();

default_idle();

@@ -393,7 +390,7 @@ static void amd_e400_idle(void)
* called with interrupts disabled.
*/
local_irq_disable();
- clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_EXIT, &cpu);
+ tick_broadcast_exit();
local_irq_enable();
} else
default_idle();
diff --git a/arch/x86/kernel/vsyscall_gtod.c b/arch/x86/kernel/vsyscall_gtod.c
index c7d791f32b98..51e330416995 100644
--- a/arch/x86/kernel/vsyscall_gtod.c
+++ b/arch/x86/kernel/vsyscall_gtod.c
@@ -31,30 +31,30 @@ void update_vsyscall(struct timekeeper *tk)
gtod_write_begin(vdata);

/* copy vsyscall data */
- vdata->vclock_mode = tk->tkr.clock->archdata.vclock_mode;
- vdata->cycle_last = tk->tkr.cycle_last;
- vdata->mask = tk->tkr.mask;
- vdata->mult = tk->tkr.mult;
- vdata->shift = tk->tkr.shift;
+ vdata->vclock_mode = tk->tkr_mono.clock->archdata.vclock_mode;
+ vdata->cycle_last = tk->tkr_mono.cycle_last;
+ vdata->mask = tk->tkr_mono.mask;
+ vdata->mult = tk->tkr_mono.mult;
+ vdata->shift = tk->tkr_mono.shift;

vdata->wall_time_sec = tk->xtime_sec;
- vdata->wall_time_snsec = tk->tkr.xtime_nsec;
+ vdata->wall_time_snsec = tk->tkr_mono.xtime_nsec;

vdata->monotonic_time_sec = tk->xtime_sec
+ tk->wall_to_monotonic.tv_sec;
- vdata->monotonic_time_snsec = tk->tkr.xtime_nsec
+ vdata->monotonic_time_snsec = tk->tkr_mono.xtime_nsec
+ ((u64)tk->wall_to_monotonic.tv_nsec
- << tk->tkr.shift);
+ << tk->tkr_mono.shift);
while (vdata->monotonic_time_snsec >=
- (((u64)NSEC_PER_SEC) << tk->tkr.shift)) {
+ (((u64)NSEC_PER_SEC) << tk->tkr_mono.shift)) {
vdata->monotonic_time_snsec -=
- ((u64)NSEC_PER_SEC) << tk->tkr.shift;
+ ((u64)NSEC_PER_SEC) << tk->tkr_mono.shift;
vdata->monotonic_time_sec++;
}

vdata->wall_time_coarse_sec = tk->xtime_sec;
- vdata->wall_time_coarse_nsec = (long)(tk->tkr.xtime_nsec >>
- tk->tkr.shift);
+ vdata->wall_time_coarse_nsec = (long)(tk->tkr_mono.xtime_nsec >>
+ tk->tkr_mono.shift);

vdata->monotonic_time_coarse_sec =
vdata->wall_time_coarse_sec + tk->wall_to_monotonic.tv_sec;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 32bf19ef3115..0ee725f1896d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1070,19 +1070,19 @@ static void update_pvclock_gtod(struct timekeeper *tk)
struct pvclock_gtod_data *vdata = &pvclock_gtod_data;
u64 boot_ns;

- boot_ns = ktime_to_ns(ktime_add(tk->tkr.base_mono, tk->offs_boot));
+ boot_ns = ktime_to_ns(ktime_add(tk->tkr_mono.base, tk->offs_boot));

write_seqcount_begin(&vdata->seq);

/* copy pvclock gtod data */
- vdata->clock.vclock_mode = tk->tkr.clock->archdata.vclock_mode;
- vdata->clock.cycle_last = tk->tkr.cycle_last;
- vdata->clock.mask = tk->tkr.mask;
- vdata->clock.mult = tk->tkr.mult;
- vdata->clock.shift = tk->tkr.shift;
+ vdata->clock.vclock_mode = tk->tkr_mono.clock->archdata.vclock_mode;
+ vdata->clock.cycle_last = tk->tkr_mono.cycle_last;
+ vdata->clock.mask = tk->tkr_mono.mask;
+ vdata->clock.mult = tk->tkr_mono.mult;
+ vdata->clock.shift = tk->tkr_mono.shift;

vdata->boot_ns = boot_ns;
- vdata->nsec_base = tk->tkr.xtime_nsec;
+ vdata->nsec_base = tk->tkr_mono.xtime_nsec;

write_seqcount_end(&vdata->seq);
}
diff --git a/arch/x86/xen/suspend.c b/arch/x86/xen/suspend.c
index c4df9dbd63b7..d9497698645a 100644
--- a/arch/x86/xen/suspend.c
+++ b/arch/x86/xen/suspend.c
@@ -1,5 +1,5 @@
#include <linux/types.h>
-#include <linux/clockchips.h>
+#include <linux/tick.h>

#include <xen/interface/xen.h>
#include <xen/grant_table.h>
@@ -81,17 +81,14 @@ void xen_arch_post_suspend(int cancelled)

static void xen_vcpu_notify_restore(void *data)
{
- unsigned long reason = (unsigned long)data;
-
/* Boot processor notified via generic timekeeping_resume() */
- if ( smp_processor_id() == 0)
+ if (smp_processor_id() == 0)
return;

- clockevents_notify(reason, NULL);
+ tick_resume_local();
}

void xen_arch_resume(void)
{
- on_each_cpu(xen_vcpu_notify_restore,
- (void *)CLOCK_EVT_NOTIFY_RESUME, 1);
+ on_each_cpu(xen_vcpu_notify_restore, NULL, 1);
}
diff --git a/drivers/acpi/acpi_pad.c b/drivers/acpi/acpi_pad.c
index c7b105c0e1d3..6bc9cbc01ad6 100644
--- a/drivers/acpi/acpi_pad.c
+++ b/drivers/acpi/acpi_pad.c
@@ -26,7 +26,7 @@
#include <linux/kthread.h>
#include <linux/freezer.h>
#include <linux/cpu.h>
-#include <linux/clockchips.h>
+#include <linux/tick.h>
#include <linux/slab.h>
#include <linux/acpi.h>
#include <asm/mwait.h>
@@ -41,8 +41,6 @@ static unsigned long power_saving_mwait_eax;

static unsigned char tsc_detected_unstable;
static unsigned char tsc_marked_unstable;
-static unsigned char lapic_detected_unstable;
-static unsigned char lapic_marked_unstable;

static void power_saving_mwait_init(void)
{
@@ -82,13 +80,10 @@ static void power_saving_mwait_init(void)
*/
if (!boot_cpu_has(X86_FEATURE_NONSTOP_TSC))
tsc_detected_unstable = 1;
- if (!boot_cpu_has(X86_FEATURE_ARAT))
- lapic_detected_unstable = 1;
break;
default:
- /* TSC & LAPIC could halt in idle */
+ /* TSC could halt in idle */
tsc_detected_unstable = 1;
- lapic_detected_unstable = 1;
}
#endif
}
@@ -155,7 +150,6 @@ static int power_saving_thread(void *data)
sched_setscheduler(current, SCHED_RR, &param);

while (!kthread_should_stop()) {
- int cpu;
unsigned long expire_time;

try_to_freeze();
@@ -177,28 +171,15 @@ static int power_saving_thread(void *data)
mark_tsc_unstable("TSC halts in idle");
tsc_marked_unstable = 1;
}
- if (lapic_detected_unstable && !lapic_marked_unstable) {
- int i;
- /* LAPIC could halt in idle, so notify users */
- for_each_online_cpu(i)
- clockevents_notify(
- CLOCK_EVT_NOTIFY_BROADCAST_ON,
- &i);
- lapic_marked_unstable = 1;
- }
local_irq_disable();
- cpu = smp_processor_id();
- if (lapic_marked_unstable)
- clockevents_notify(
- CLOCK_EVT_NOTIFY_BROADCAST_ENTER, &cpu);
+ tick_broadcast_enable();
+ tick_broadcast_enter();
stop_critical_timings();

mwait_idle_with_hints(power_saving_mwait_eax, 1);

start_critical_timings();
- if (lapic_marked_unstable)
- clockevents_notify(
- CLOCK_EVT_NOTIFY_BROADCAST_EXIT, &cpu);
+ tick_broadcast_exit();
local_irq_enable();

if (time_before(expire_time, jiffies)) {
diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
index c6bb9f1257c9..53355196c5c6 100644
--- a/drivers/acpi/processor_idle.c
+++ b/drivers/acpi/processor_idle.c
@@ -32,7 +32,7 @@
#include <linux/acpi.h>
#include <linux/dmi.h>
#include <linux/sched.h> /* need_resched() */
-#include <linux/clockchips.h>
+#include <linux/tick.h>
#include <linux/cpuidle.h>
#include <linux/syscore_ops.h>
#include <acpi/processor.h>
@@ -157,12 +157,11 @@ static void lapic_timer_check_state(int state, struct acpi_processor *pr,
static void __lapic_timer_propagate_broadcast(void *arg)
{
struct acpi_processor *pr = (struct acpi_processor *) arg;
- unsigned long reason;

- reason = pr->power.timer_broadcast_on_state < INT_MAX ?
- CLOCK_EVT_NOTIFY_BROADCAST_ON : CLOCK_EVT_NOTIFY_BROADCAST_OFF;
-
- clockevents_notify(reason, &pr->id);
+ if (pr->power.timer_broadcast_on_state < INT_MAX)
+ tick_broadcast_enable();
+ else
+ tick_broadcast_disable();
}

static void lapic_timer_propagate_broadcast(struct acpi_processor *pr)
@@ -179,11 +178,10 @@ static void lapic_timer_state_broadcast(struct acpi_processor *pr,
int state = cx - pr->power.states;

if (state >= pr->power.timer_broadcast_on_state) {
- unsigned long reason;
-
- reason = broadcast ? CLOCK_EVT_NOTIFY_BROADCAST_ENTER :
- CLOCK_EVT_NOTIFY_BROADCAST_EXIT;
- clockevents_notify(reason, &pr->id);
+ if (broadcast)
+ tick_broadcast_enter();
+ else
+ tick_broadcast_exit();
}
}

diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
index a3025e7ae35f..266469691e58 100644
--- a/drivers/clocksource/arm_arch_timer.c
+++ b/drivers/clocksource/arm_arch_timer.c
@@ -661,17 +661,17 @@ static const struct of_device_id arch_timer_mem_of_match[] __initconst = {
};

static bool __init
-arch_timer_probed(int type, const struct of_device_id *matches)
+arch_timer_needs_probing(int type, const struct of_device_id *matches)
{
struct device_node *dn;
- bool probed = true;
+ bool needs_probing = false;

dn = of_find_matching_node(NULL, matches);
if (dn && of_device_is_available(dn) && !(arch_timers_present & type))
- probed = false;
+ needs_probing = true;
of_node_put(dn);

- return probed;
+ return needs_probing;
}

static void __init arch_timer_common_init(void)
@@ -680,9 +680,9 @@ static void __init arch_timer_common_init(void)

/* Wait until both nodes are probed if we have two timers */
if ((arch_timers_present & mask) != mask) {
- if (!arch_timer_probed(ARCH_MEM_TIMER, arch_timer_mem_of_match))
+ if (arch_timer_needs_probing(ARCH_MEM_TIMER, arch_timer_mem_of_match))
return;
- if (!arch_timer_probed(ARCH_CP15_TIMER, arch_timer_of_match))
+ if (arch_timer_needs_probing(ARCH_CP15_TIMER, arch_timer_of_match))
return;
}

diff --git a/drivers/clocksource/dw_apb_timer_of.c b/drivers/clocksource/dw_apb_timer_of.c
index d305fb089767..a19a3f619cc7 100644
--- a/drivers/clocksource/dw_apb_timer_of.c
+++ b/drivers/clocksource/dw_apb_timer_of.c
@@ -108,7 +108,7 @@ static void __init add_clocksource(struct device_node *source_timer)

static u64 notrace read_sched_clock(void)
{
- return ~__raw_readl(sched_io_base);
+ return ~readl_relaxed(sched_io_base);
}

static const struct of_device_id sptimer_ids[] __initconst = {
diff --git a/drivers/clocksource/em_sti.c b/drivers/clocksource/em_sti.c
index d0a7bd66b8b9..dc3c6ee04aaa 100644
--- a/drivers/clocksource/em_sti.c
+++ b/drivers/clocksource/em_sti.c
@@ -210,7 +210,7 @@ static int em_sti_clocksource_enable(struct clocksource *cs)

ret = em_sti_start(p, USER_CLOCKSOURCE);
if (!ret)
- __clocksource_updatefreq_hz(cs, p->rate);
+ __clocksource_update_freq_hz(cs, p->rate);
return ret;
}

diff --git a/drivers/clocksource/sh_cmt.c b/drivers/clocksource/sh_cmt.c
index 2bd13b53b727..b8ff3c64cc45 100644
--- a/drivers/clocksource/sh_cmt.c
+++ b/drivers/clocksource/sh_cmt.c
@@ -641,7 +641,7 @@ static int sh_cmt_clocksource_enable(struct clocksource *cs)

ret = sh_cmt_start(ch, FLAG_CLOCKSOURCE);
if (!ret) {
- __clocksource_updatefreq_hz(cs, ch->rate);
+ __clocksource_update_freq_hz(cs, ch->rate);
ch->cs_enabled = true;
}
return ret;
diff --git a/drivers/clocksource/sh_tmu.c b/drivers/clocksource/sh_tmu.c
index f150ca82bfaf..b6b8fa3cd211 100644
--- a/drivers/clocksource/sh_tmu.c
+++ b/drivers/clocksource/sh_tmu.c
@@ -272,7 +272,7 @@ static int sh_tmu_clocksource_enable(struct clocksource *cs)

ret = sh_tmu_enable(ch);
if (!ret) {
- __clocksource_updatefreq_hz(cs, ch->rate);
+ __clocksource_update_freq_hz(cs, ch->rate);
ch->cs_enabled = true;
}

diff --git a/drivers/clocksource/sun4i_timer.c b/drivers/clocksource/sun4i_timer.c
index f4a9c0058b4d..1928a8912584 100644
--- a/drivers/clocksource/sun4i_timer.c
+++ b/drivers/clocksource/sun4i_timer.c
@@ -170,7 +170,15 @@ static void __init sun4i_timer_init(struct device_node *node)
TIMER_CTL_CLK_SRC(TIMER_CTL_CLK_SRC_OSC24M),
timer_base + TIMER_CTL_REG(1));

- sched_clock_register(sun4i_timer_sched_read, 32, rate);
+ /*
+ * sched_clock_register does not have priorities, and on sun6i and
+ * later there is a better sched_clock registered by arm_arch_timer.c
+ */
+ if (of_machine_is_compatible("allwinner,sun4i-a10") ||
+ of_machine_is_compatible("allwinner,sun5i-a13") ||
+ of_machine_is_compatible("allwinner,sun5i-a10s"))
+ sched_clock_register(sun4i_timer_sched_read, 32, rate);
+
clocksource_mmio_init(timer_base + TIMER_CNTVAL_REG(1), node->name,
rate, 350, 32, clocksource_mmio_readl_down);

diff --git a/drivers/clocksource/tegra20_timer.c b/drivers/clocksource/tegra20_timer.c
index d2616ef16770..5a112d72fc2d 100644
--- a/drivers/clocksource/tegra20_timer.c
+++ b/drivers/clocksource/tegra20_timer.c
@@ -51,15 +51,15 @@
static void __iomem *timer_reg_base;
static void __iomem *rtc_base;

-static struct timespec persistent_ts;
+static struct timespec64 persistent_ts;
static u64 persistent_ms, last_persistent_ms;

static struct delay_timer tegra_delay_timer;

#define timer_writel(value, reg) \
- __raw_writel(value, timer_reg_base + (reg))
+ writel_relaxed(value, timer_reg_base + (reg))
#define timer_readl(reg) \
- __raw_readl(timer_reg_base + (reg))
+ readl_relaxed(timer_reg_base + (reg))

static int tegra_timer_set_next_event(unsigned long cycles,
struct clock_event_device *evt)
@@ -120,26 +120,25 @@ static u64 tegra_rtc_read_ms(void)
}

/*
- * tegra_read_persistent_clock - Return time from a persistent clock.
+ * tegra_read_persistent_clock64 - Return time from a persistent clock.
*
* Reads the time from a source which isn't disabled during PM, the
* 32k sync timer. Convert the cycles elapsed since last read into
- * nsecs and adds to a monotonically increasing timespec.
+ * nsecs and adds to a monotonically increasing timespec64.
* Care must be taken that this funciton is not called while the
* tegra_rtc driver could be executing to avoid race conditions
* on the RTC shadow register
*/
-static void tegra_read_persistent_clock(struct timespec *ts)
+static void tegra_read_persistent_clock64(struct timespec64 *ts)
{
u64 delta;
- struct timespec *tsp = &persistent_ts;

last_persistent_ms = persistent_ms;
persistent_ms = tegra_rtc_read_ms();
delta = persistent_ms - last_persistent_ms;

- timespec_add_ns(tsp, delta * NSEC_PER_MSEC);
- *ts = *tsp;
+ timespec64_add_ns(&persistent_ts, delta * NSEC_PER_MSEC);
+ *ts = persistent_ts;
}

static unsigned long tegra_delay_timer_read_counter_long(void)
@@ -252,7 +251,7 @@ static void __init tegra20_init_rtc(struct device_node *np)
else
clk_prepare_enable(clk);

- register_persistent_clock(NULL, tegra_read_persistent_clock);
+ register_persistent_clock(NULL, tegra_read_persistent_clock64);
}
CLOCKSOURCE_OF_DECLARE(tegra20_rtc, "nvidia,tegra20-rtc", tegra20_init_rtc);

diff --git a/drivers/clocksource/time-efm32.c b/drivers/clocksource/time-efm32.c
index ec57ba2bbd87..5b6e3d5644c9 100644
--- a/drivers/clocksource/time-efm32.c
+++ b/drivers/clocksource/time-efm32.c
@@ -111,7 +111,7 @@ static irqreturn_t efm32_clock_event_handler(int irq, void *dev_id)
static struct efm32_clock_event_ddata clock_event_ddata = {
.evtdev = {
.name = "efm32 clockevent",
- .features = CLOCK_EVT_FEAT_ONESHOT | CLOCK_EVT_MODE_PERIODIC,
+ .features = CLOCK_EVT_FEAT_ONESHOT | CLOCK_EVT_FEAT_PERIODIC,
.set_mode = efm32_clock_event_set_mode,
.set_next_event = efm32_clock_event_set_next_event,
.rating = 200,
diff --git a/drivers/clocksource/timer-atmel-pit.c b/drivers/clocksource/timer-atmel-pit.c
index b5b4d4585c9a..c0304ff608b0 100644
--- a/drivers/clocksource/timer-atmel-pit.c
+++ b/drivers/clocksource/timer-atmel-pit.c
@@ -61,12 +61,12 @@ static inline struct pit_data *clkevt_to_pit_data(struct clock_event_device *clk

static inline unsigned int pit_read(void __iomem *base, unsigned int reg_offset)
{
- return __raw_readl(base + reg_offset);
+ return readl_relaxed(base + reg_offset);
}

static inline void pit_write(void __iomem *base, unsigned int reg_offset, unsigned long value)
{
- __raw_writel(value, base + reg_offset);
+ writel_relaxed(value, base + reg_offset);
}

/*
diff --git a/drivers/clocksource/timer-sun5i.c b/drivers/clocksource/timer-sun5i.c
index 58597fbcc046..28aa4b7bb602 100644
--- a/drivers/clocksource/timer-sun5i.c
+++ b/drivers/clocksource/timer-sun5i.c
@@ -17,6 +17,7 @@
#include <linux/irq.h>
#include <linux/irqreturn.h>
#include <linux/reset.h>
+#include <linux/slab.h>
#include <linux/of.h>
#include <linux/of_address.h>
#include <linux/of_irq.h>
@@ -36,8 +37,31 @@

#define TIMER_SYNC_TICKS 3

-static void __iomem *timer_base;
-static u32 ticks_per_jiffy;
+struct sun5i_timer {
+ void __iomem *base;
+ struct clk *clk;
+ struct notifier_block clk_rate_cb;
+ u32 ticks_per_jiffy;
+};
+
+#define to_sun5i_timer(x) \
+ container_of(x, struct sun5i_timer, clk_rate_cb)
+
+struct sun5i_timer_clksrc {
+ struct sun5i_timer timer;
+ struct clocksource clksrc;
+};
+
+#define to_sun5i_timer_clksrc(x) \
+ container_of(x, struct sun5i_timer_clksrc, clksrc)
+
+struct sun5i_timer_clkevt {
+ struct sun5i_timer timer;
+ struct clock_event_device clkevt;
+};
+
+#define to_sun5i_timer_clkevt(x) \
+ container_of(x, struct sun5i_timer_clkevt, clkevt)

/*
* When we disable a timer, we need to wait at least for 2 cycles of
@@ -45,30 +69,30 @@ static u32 ticks_per_jiffy;
* that is already setup and runs at the same frequency than the other
* timers, and we never will be disabled.
*/
-static void sun5i_clkevt_sync(void)
+static void sun5i_clkevt_sync(struct sun5i_timer_clkevt *ce)
{
- u32 old = readl(timer_base + TIMER_CNTVAL_LO_REG(1));
+ u32 old = readl(ce->timer.base + TIMER_CNTVAL_LO_REG(1));

- while ((old - readl(timer_base + TIMER_CNTVAL_LO_REG(1))) < TIMER_SYNC_TICKS)
+ while ((old - readl(ce->timer.base + TIMER_CNTVAL_LO_REG(1))) < TIMER_SYNC_TICKS)
cpu_relax();
}

-static void sun5i_clkevt_time_stop(u8 timer)
+static void sun5i_clkevt_time_stop(struct sun5i_timer_clkevt *ce, u8 timer)
{
- u32 val = readl(timer_base + TIMER_CTL_REG(timer));
- writel(val & ~TIMER_CTL_ENABLE, timer_base + TIMER_CTL_REG(timer));
+ u32 val = readl(ce->timer.base + TIMER_CTL_REG(timer));
+ writel(val & ~TIMER_CTL_ENABLE, ce->timer.base + TIMER_CTL_REG(timer));

- sun5i_clkevt_sync();
+ sun5i_clkevt_sync(ce);
}

-static void sun5i_clkevt_time_setup(u8 timer, u32 delay)
+static void sun5i_clkevt_time_setup(struct sun5i_timer_clkevt *ce, u8 timer, u32 delay)
{
- writel(delay, timer_base + TIMER_INTVAL_LO_REG(timer));
+ writel(delay, ce->timer.base + TIMER_INTVAL_LO_REG(timer));
}

-static void sun5i_clkevt_time_start(u8 timer, bool periodic)
+static void sun5i_clkevt_time_start(struct sun5i_timer_clkevt *ce, u8 timer, bool periodic)
{
- u32 val = readl(timer_base + TIMER_CTL_REG(timer));
+ u32 val = readl(ce->timer.base + TIMER_CTL_REG(timer));

if (periodic)
val &= ~TIMER_CTL_ONESHOT;
@@ -76,75 +100,230 @@ static void sun5i_clkevt_time_start(u8 timer, bool periodic)
val |= TIMER_CTL_ONESHOT;

writel(val | TIMER_CTL_ENABLE | TIMER_CTL_RELOAD,
- timer_base + TIMER_CTL_REG(timer));
+ ce->timer.base + TIMER_CTL_REG(timer));
}

static void sun5i_clkevt_mode(enum clock_event_mode mode,
- struct clock_event_device *clk)
+ struct clock_event_device *clkevt)
{
+ struct sun5i_timer_clkevt *ce = to_sun5i_timer_clkevt(clkevt);
+
switch (mode) {
case CLOCK_EVT_MODE_PERIODIC:
- sun5i_clkevt_time_stop(0);
- sun5i_clkevt_time_setup(0, ticks_per_jiffy);
- sun5i_clkevt_time_start(0, true);
+ sun5i_clkevt_time_stop(ce, 0);
+ sun5i_clkevt_time_setup(ce, 0, ce->timer.ticks_per_jiffy);
+ sun5i_clkevt_time_start(ce, 0, true);
break;
case CLOCK_EVT_MODE_ONESHOT:
- sun5i_clkevt_time_stop(0);
- sun5i_clkevt_time_start(0, false);
+ sun5i_clkevt_time_stop(ce, 0);
+ sun5i_clkevt_time_start(ce, 0, false);
break;
case CLOCK_EVT_MODE_UNUSED:
case CLOCK_EVT_MODE_SHUTDOWN:
default:
- sun5i_clkevt_time_stop(0);
+ sun5i_clkevt_time_stop(ce, 0);
break;
}
}

static int sun5i_clkevt_next_event(unsigned long evt,
- struct clock_event_device *unused)
+ struct clock_event_device *clkevt)
{
- sun5i_clkevt_time_stop(0);
- sun5i_clkevt_time_setup(0, evt - TIMER_SYNC_TICKS);
- sun5i_clkevt_time_start(0, false);
+ struct sun5i_timer_clkevt *ce = to_sun5i_timer_clkevt(clkevt);
+
+ sun5i_clkevt_time_stop(ce, 0);
+ sun5i_clkevt_time_setup(ce, 0, evt - TIMER_SYNC_TICKS);
+ sun5i_clkevt_time_start(ce, 0, false);

return 0;
}

-static struct clock_event_device sun5i_clockevent = {
- .name = "sun5i_tick",
- .rating = 340,
- .features = CLOCK_EVT_FEAT_PERIODIC | CLOCK_EVT_FEAT_ONESHOT,
- .set_mode = sun5i_clkevt_mode,
- .set_next_event = sun5i_clkevt_next_event,
-};
-
-
static irqreturn_t sun5i_timer_interrupt(int irq, void *dev_id)
{
- struct clock_event_device *evt = (struct clock_event_device *)dev_id;
+ struct sun5i_timer_clkevt *ce = (struct sun5i_timer_clkevt *)dev_id;

- writel(0x1, timer_base + TIMER_IRQ_ST_REG);
- evt->event_handler(evt);
+ writel(0x1, ce->timer.base + TIMER_IRQ_ST_REG);
+ ce->clkevt.event_handler(&ce->clkevt);

return IRQ_HANDLED;
}

-static struct irqaction sun5i_timer_irq = {
- .name = "sun5i_timer0",
- .flags = IRQF_TIMER | IRQF_IRQPOLL,
- .handler = sun5i_timer_interrupt,
- .dev_id = &sun5i_clockevent,
-};
+static cycle_t sun5i_clksrc_read(struct clocksource *clksrc)
+{
+ struct sun5i_timer_clksrc *cs = to_sun5i_timer_clksrc(clksrc);
+
+ return ~readl(cs->timer.base + TIMER_CNTVAL_LO_REG(1));
+}
+
+static int sun5i_rate_cb_clksrc(struct notifier_block *nb,
+ unsigned long event, void *data)
+{
+ struct clk_notifier_data *ndata = data;
+ struct sun5i_timer *timer = to_sun5i_timer(nb);
+ struct sun5i_timer_clksrc *cs = container_of(timer, struct sun5i_timer_clksrc, timer);
+
+ switch (event) {
+ case PRE_RATE_CHANGE:
+ clocksource_unregister(&cs->clksrc);
+ break;
+
+ case POST_RATE_CHANGE:
+ clocksource_register_hz(&cs->clksrc, ndata->new_rate);
+ break;
+
+ default:
+ break;
+ }
+
+ return NOTIFY_DONE;
+}
+
+static int __init sun5i_setup_clocksource(struct device_node *node,
+ void __iomem *base,
+ struct clk *clk, int irq)
+{
+ struct sun5i_timer_clksrc *cs;
+ unsigned long rate;
+ int ret;
+
+ cs = kzalloc(sizeof(*cs), GFP_KERNEL);
+ if (!cs)
+ return -ENOMEM;
+
+ ret = clk_prepare_enable(clk);
+ if (ret) {
+ pr_err("Couldn't enable parent clock\n");
+ goto err_free;
+ }
+
+ rate = clk_get_rate(clk);
+
+ cs->timer.base = base;
+ cs->timer.clk = clk;
+ cs->timer.clk_rate_cb.notifier_call = sun5i_rate_cb_clksrc;
+ cs->timer.clk_rate_cb.next = NULL;
+
+ ret = clk_notifier_register(clk, &cs->timer.clk_rate_cb);
+ if (ret) {
+ pr_err("Unable to register clock notifier.\n");
+ goto err_disable_clk;
+ }
+
+ writel(~0, base + TIMER_INTVAL_LO_REG(1));
+ writel(TIMER_CTL_ENABLE | TIMER_CTL_RELOAD,
+ base + TIMER_CTL_REG(1));
+
+ cs->clksrc.name = node->name;
+ cs->clksrc.rating = 340;
+ cs->clksrc.read = sun5i_clksrc_read;
+ cs->clksrc.mask = CLOCKSOURCE_MASK(32);
+ cs->clksrc.flags = CLOCK_SOURCE_IS_CONTINUOUS;
+
+ ret = clocksource_register_hz(&cs->clksrc, rate);
+ if (ret) {
+ pr_err("Couldn't register clock source.\n");
+ goto err_remove_notifier;
+ }
+
+ return 0;
+
+err_remove_notifier:
+ clk_notifier_unregister(clk, &cs->timer.clk_rate_cb);
+err_disable_clk:
+ clk_disable_unprepare(clk);
+err_free:
+ kfree(cs);
+ return ret;
+}
+
+static int sun5i_rate_cb_clkevt(struct notifier_block *nb,
+ unsigned long event, void *data)
+{
+ struct clk_notifier_data *ndata = data;
+ struct sun5i_timer *timer = to_sun5i_timer(nb);
+ struct sun5i_timer_clkevt *ce = container_of(timer, struct sun5i_timer_clkevt, timer);
+
+ if (event == POST_RATE_CHANGE) {
+ clockevents_update_freq(&ce->clkevt, ndata->new_rate);
+ ce->timer.ticks_per_jiffy = DIV_ROUND_UP(ndata->new_rate, HZ);
+ }
+
+ return NOTIFY_DONE;
+}
+
+static int __init sun5i_setup_clockevent(struct device_node *node, void __iomem *base,
+ struct clk *clk, int irq)
+{
+ struct sun5i_timer_clkevt *ce;
+ unsigned long rate;
+ int ret;
+ u32 val;
+
+ ce = kzalloc(sizeof(*ce), GFP_KERNEL);
+ if (!ce)
+ return -ENOMEM;
+
+ ret = clk_prepare_enable(clk);
+ if (ret) {
+ pr_err("Couldn't enable parent clock\n");
+ goto err_free;
+ }
+
+ rate = clk_get_rate(clk);
+
+ ce->timer.base = base;
+ ce->timer.ticks_per_jiffy = DIV_ROUND_UP(rate, HZ);
+ ce->timer.clk = clk;
+ ce->timer.clk_rate_cb.notifier_call = sun5i_rate_cb_clkevt;
+ ce->timer.clk_rate_cb.next = NULL;
+
+ ret = clk_notifier_register(clk, &ce->timer.clk_rate_cb);
+ if (ret) {
+ pr_err("Unable to register clock notifier.\n");
+ goto err_disable_clk;
+ }
+
+ ce->clkevt.name = node->name;
+ ce->clkevt.features = CLOCK_EVT_FEAT_PERIODIC | CLOCK_EVT_FEAT_ONESHOT;
+ ce->clkevt.set_next_event = sun5i_clkevt_next_event;
+ ce->clkevt.set_mode = sun5i_clkevt_mode;
+ ce->clkevt.rating = 340;
+ ce->clkevt.irq = irq;
+ ce->clkevt.cpumask = cpu_possible_mask;
+
+ /* Enable timer0 interrupt */
+ val = readl(base + TIMER_IRQ_EN_REG);
+ writel(val | TIMER_IRQ_EN(0), base + TIMER_IRQ_EN_REG);
+
+ clockevents_config_and_register(&ce->clkevt, rate,
+ TIMER_SYNC_TICKS, 0xffffffff);
+
+ ret = request_irq(irq, sun5i_timer_interrupt, IRQF_TIMER | IRQF_IRQPOLL,
+ "sun5i_timer0", ce);
+ if (ret) {
+ pr_err("Unable to register interrupt\n");
+ goto err_remove_notifier;
+ }
+
+ return 0;
+
+err_remove_notifier:
+ clk_notifier_unregister(clk, &ce->timer.clk_rate_cb);
+err_disable_clk:
+ clk_disable_unprepare(clk);
+err_free:
+ kfree(ce);
+ return ret;
+}

static void __init sun5i_timer_init(struct device_node *node)
{
struct reset_control *rstc;
- unsigned long rate;
+ void __iomem *timer_base;
struct clk *clk;
- int ret, irq;
- u32 val;
+ int irq;

- timer_base = of_iomap(node, 0);
+ timer_base = of_io_request_and_map(node, 0, of_node_full_name(node));
if (!timer_base)
panic("Can't map registers");

@@ -155,35 +334,13 @@ static void __init sun5i_timer_init(struct device_node *node)
clk = of_clk_get(node, 0);
if (IS_ERR(clk))
panic("Can't get timer clock");
- clk_prepare_enable(clk);
- rate = clk_get_rate(clk);

rstc = of_reset_control_get(node, NULL);
if (!IS_ERR(rstc))
reset_control_deassert(rstc);

- writel(~0, timer_base + TIMER_INTVAL_LO_REG(1));
- writel(TIMER_CTL_ENABLE | TIMER_CTL_RELOAD,
- timer_base + TIMER_CTL_REG(1));
-
- clocksource_mmio_init(timer_base + TIMER_CNTVAL_LO_REG(1), node->name,
- rate, 340, 32, clocksource_mmio_readl_down);
-
- ticks_per_jiffy = DIV_ROUND_UP(rate, HZ);
-
- /* Enable timer0 interrupt */
- val = readl(timer_base + TIMER_IRQ_EN_REG);
- writel(val | TIMER_IRQ_EN(0), timer_base + TIMER_IRQ_EN_REG);
-
- sun5i_clockevent.cpumask = cpu_possible_mask;
- sun5i_clockevent.irq = irq;
-
- clockevents_config_and_register(&sun5i_clockevent, rate,
- TIMER_SYNC_TICKS, 0xffffffff);
-
- ret = setup_irq(irq, &sun5i_timer_irq);
- if (ret)
- pr_warn("failed to setup irq %d\n", irq);
+ sun5i_setup_clocksource(node, timer_base, clk, irq);
+ sun5i_setup_clockevent(node, timer_base, clk, irq);
}
CLOCKSOURCE_OF_DECLARE(sun5i_a13, "allwinner,sun5i-a13-hstimer",
sun5i_timer_init);
diff --git a/drivers/cpuidle/driver.c b/drivers/cpuidle/driver.c
index 2697e87d5b34..5db147859b90 100644
--- a/drivers/cpuidle/driver.c
+++ b/drivers/cpuidle/driver.c
@@ -13,7 +13,7 @@
#include <linux/sched.h>
#include <linux/cpuidle.h>
#include <linux/cpumask.h>
-#include <linux/clockchips.h>
+#include <linux/tick.h>

#include "cpuidle.h"

@@ -130,21 +130,20 @@ static inline void __cpuidle_unset_driver(struct cpuidle_driver *drv)
#endif

/**
- * cpuidle_setup_broadcast_timer - enable/disable the broadcast timer
+ * cpuidle_setup_broadcast_timer - enable/disable the broadcast timer on a cpu
* @arg: a void pointer used to match the SMP cross call API
*
- * @arg is used as a value of type 'long' with one of the two values:
- * - CLOCK_EVT_NOTIFY_BROADCAST_ON
- * - CLOCK_EVT_NOTIFY_BROADCAST_OFF
+ * If @arg is NULL broadcast is disabled otherwise enabled
*
- * Set the broadcast timer notification for the current CPU. This function
- * is executed per CPU by an SMP cross call. It not supposed to be called
- * directly.
+ * This function is executed per CPU by an SMP cross call. It's not
+ * supposed to be called directly.
*/
static void cpuidle_setup_broadcast_timer(void *arg)
{
- int cpu = smp_processor_id();
- clockevents_notify((long)(arg), &cpu);
+ if (arg)
+ tick_broadcast_enable();
+ else
+ tick_broadcast_disable();
}

/**
@@ -239,7 +238,7 @@ static int __cpuidle_register_driver(struct cpuidle_driver *drv)

if (drv->bctimer)
on_each_cpu_mask(drv->cpumask, cpuidle_setup_broadcast_timer,
- (void *)CLOCK_EVT_NOTIFY_BROADCAST_ON, 1);
+ (void *)1, 1);

poll_idle_init(drv);

@@ -263,7 +262,7 @@ static void __cpuidle_unregister_driver(struct cpuidle_driver *drv)
if (drv->bctimer) {
drv->bctimer = 0;
on_each_cpu_mask(drv->cpumask, cpuidle_setup_broadcast_timer,
- (void *)CLOCK_EVT_NOTIFY_BROADCAST_OFF, 1);
+ NULL, 1);
}

__cpuidle_unset_driver(drv);
diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index b0e58522780d..5c979d0667a2 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -55,7 +55,7 @@

#include <linux/kernel.h>
#include <linux/cpuidle.h>
-#include <linux/clockchips.h>
+#include <linux/tick.h>
#include <trace/events/power.h>
#include <linux/sched.h>
#include <linux/notifier.h>
@@ -638,12 +638,12 @@ static int intel_idle(struct cpuidle_device *dev,
leave_mm(cpu);

if (!(lapic_timer_reliable_states & (1 << (cstate))))
- clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER, &cpu);
+ tick_broadcast_enter();

mwait_idle_with_hints(eax, ecx);

if (!(lapic_timer_reliable_states & (1 << (cstate))))
- clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_EXIT, &cpu);
+ tick_broadcast_exit();

return index;
}
@@ -665,13 +665,12 @@ static void intel_idle_freeze(struct cpuidle_device *dev,

static void __setup_broadcast_timer(void *arg)
{
- unsigned long reason = (unsigned long)arg;
- int cpu = smp_processor_id();
-
- reason = reason ?
- CLOCK_EVT_NOTIFY_BROADCAST_ON : CLOCK_EVT_NOTIFY_BROADCAST_OFF;
+ unsigned long on = (unsigned long)arg;

- clockevents_notify(reason, &cpu);
+ if (on)
+ tick_broadcast_enable();
+ else
+ tick_broadcast_disable();
}

static int cpu_hotplug_notify(struct notifier_block *n,
diff --git a/drivers/rtc/class.c b/drivers/rtc/class.c
index 472a5adc4642..c29ba7e14304 100644
--- a/drivers/rtc/class.c
+++ b/drivers/rtc/class.c
@@ -55,7 +55,7 @@ static int rtc_suspend(struct device *dev)
struct timespec64 delta, delta_delta;
int err;

- if (has_persistent_clock())
+ if (timekeeping_rtc_skipsuspend())
return 0;

if (strcmp(dev_name(&rtc->dev), CONFIG_RTC_HCTOSYS_DEVICE) != 0)
@@ -102,7 +102,7 @@ static int rtc_resume(struct device *dev)
struct timespec64 sleep_time;
int err;

- if (has_persistent_clock())
+ if (timekeeping_rtc_skipresume())
return 0;

rtc_hctosys_ret = -ENODEV;
@@ -117,10 +117,6 @@ static int rtc_resume(struct device *dev)
return 0;
}

- if (rtc_valid_tm(&tm) != 0) {
- pr_debug("%s: bogus resume time\n", dev_name(&rtc->dev));
- return 0;
- }
new_rtc.tv_sec = rtc_tm_to_time64(&tm);
new_rtc.tv_nsec = 0;

diff --git a/drivers/rtc/interface.c b/drivers/rtc/interface.c
index 37215cf983e9..d43ee409a5f2 100644
--- a/drivers/rtc/interface.c
+++ b/drivers/rtc/interface.c
@@ -72,7 +72,11 @@ int rtc_set_time(struct rtc_device *rtc, struct rtc_time *tm)
err = -ENODEV;
else if (rtc->ops->set_time)
err = rtc->ops->set_time(rtc->dev.parent, tm);
- else if (rtc->ops->set_mmss) {
+ else if (rtc->ops->set_mmss64) {
+ time64_t secs64 = rtc_tm_to_time64(tm);
+
+ err = rtc->ops->set_mmss64(rtc->dev.parent, secs64);
+ } else if (rtc->ops->set_mmss) {
time64_t secs64 = rtc_tm_to_time64(tm);
err = rtc->ops->set_mmss(rtc->dev.parent, secs64);
} else
@@ -96,6 +100,8 @@ int rtc_set_mmss(struct rtc_device *rtc, unsigned long secs)

if (!rtc->ops)
err = -ENODEV;
+ else if (rtc->ops->set_mmss64)
+ err = rtc->ops->set_mmss64(rtc->dev.parent, secs);
else if (rtc->ops->set_mmss)
err = rtc->ops->set_mmss(rtc->dev.parent, secs);
else if (rtc->ops->read_time && rtc->ops->set_time) {
diff --git a/drivers/rtc/rtc-ab3100.c b/drivers/rtc/rtc-ab3100.c
index 1d0340fdb820..9b725c553058 100644
--- a/drivers/rtc/rtc-ab3100.c
+++ b/drivers/rtc/rtc-ab3100.c
@@ -43,21 +43,21 @@
/*
* RTC clock functions and device struct declaration
*/
-static int ab3100_rtc_set_mmss(struct device *dev, unsigned long secs)
+static int ab3100_rtc_set_mmss(struct device *dev, time64_t secs)
{
u8 regs[] = {AB3100_TI0, AB3100_TI1, AB3100_TI2,
AB3100_TI3, AB3100_TI4, AB3100_TI5};
unsigned char buf[6];
- u64 fat_time = (u64) secs * AB3100_RTC_CLOCK_RATE * 2;
+ u64 hw_counter = secs * AB3100_RTC_CLOCK_RATE * 2;
int err = 0;
int i;

- buf[0] = (fat_time) & 0xFF;
- buf[1] = (fat_time >> 8) & 0xFF;
- buf[2] = (fat_time >> 16) & 0xFF;
- buf[3] = (fat_time >> 24) & 0xFF;
- buf[4] = (fat_time >> 32) & 0xFF;
- buf[5] = (fat_time >> 40) & 0xFF;
+ buf[0] = (hw_counter) & 0xFF;
+ buf[1] = (hw_counter >> 8) & 0xFF;
+ buf[2] = (hw_counter >> 16) & 0xFF;
+ buf[3] = (hw_counter >> 24) & 0xFF;
+ buf[4] = (hw_counter >> 32) & 0xFF;
+ buf[5] = (hw_counter >> 40) & 0xFF;

for (i = 0; i < 6; i++) {
err = abx500_set_register_interruptible(dev, 0,
@@ -75,7 +75,7 @@ static int ab3100_rtc_set_mmss(struct device *dev, unsigned long secs)

static int ab3100_rtc_read_time(struct device *dev, struct rtc_time *tm)
{
- unsigned long time;
+ time64_t time;
u8 rtcval;
int err;

@@ -88,7 +88,7 @@ static int ab3100_rtc_read_time(struct device *dev, struct rtc_time *tm)
dev_info(dev, "clock not set (lost power)");
return -EINVAL;
} else {
- u64 fat_time;
+ u64 hw_counter;
u8 buf[6];

/* Read out time registers */
@@ -98,22 +98,21 @@ static int ab3100_rtc_read_time(struct device *dev, struct rtc_time *tm)
if (err != 0)
return err;

- fat_time = ((u64) buf[5] << 40) | ((u64) buf[4] << 32) |
+ hw_counter = ((u64) buf[5] << 40) | ((u64) buf[4] << 32) |
((u64) buf[3] << 24) | ((u64) buf[2] << 16) |
((u64) buf[1] << 8) | (u64) buf[0];
- time = (unsigned long) (fat_time /
- (u64) (AB3100_RTC_CLOCK_RATE * 2));
+ time = hw_counter / (u64) (AB3100_RTC_CLOCK_RATE * 2);
}

- rtc_time_to_tm(time, tm);
+ rtc_time64_to_tm(time, tm);

return rtc_valid_tm(tm);
}

static int ab3100_rtc_read_alarm(struct device *dev, struct rtc_wkalrm *alarm)
{
- unsigned long time;
- u64 fat_time;
+ time64_t time;
+ u64 hw_counter;
u8 buf[6];
u8 rtcval;
int err;
@@ -134,11 +133,11 @@ static int ab3100_rtc_read_alarm(struct device *dev, struct rtc_wkalrm *alarm)
AB3100_AL0, buf, 4);
if (err)
return err;
- fat_time = ((u64) buf[3] << 40) | ((u64) buf[2] << 32) |
+ hw_counter = ((u64) buf[3] << 40) | ((u64) buf[2] << 32) |
((u64) buf[1] << 24) | ((u64) buf[0] << 16);
- time = (unsigned long) (fat_time / (u64) (AB3100_RTC_CLOCK_RATE * 2));
+ time = hw_counter / (u64) (AB3100_RTC_CLOCK_RATE * 2);

- rtc_time_to_tm(time, &alarm->time);
+ rtc_time64_to_tm(time, &alarm->time);

return rtc_valid_tm(&alarm->time);
}
@@ -147,17 +146,17 @@ static int ab3100_rtc_set_alarm(struct device *dev, struct rtc_wkalrm *alarm)
{
u8 regs[] = {AB3100_AL0, AB3100_AL1, AB3100_AL2, AB3100_AL3};
unsigned char buf[4];
- unsigned long secs;
- u64 fat_time;
+ time64_t secs;
+ u64 hw_counter;
int err;
int i;

- rtc_tm_to_time(&alarm->time, &secs);
- fat_time = (u64) secs * AB3100_RTC_CLOCK_RATE * 2;
- buf[0] = (fat_time >> 16) & 0xFF;
- buf[1] = (fat_time >> 24) & 0xFF;
- buf[2] = (fat_time >> 32) & 0xFF;
- buf[3] = (fat_time >> 40) & 0xFF;
+ secs = rtc_tm_to_time64(&alarm->time);
+ hw_counter = secs * AB3100_RTC_CLOCK_RATE * 2;
+ buf[0] = (hw_counter >> 16) & 0xFF;
+ buf[1] = (hw_counter >> 24) & 0xFF;
+ buf[2] = (hw_counter >> 32) & 0xFF;
+ buf[3] = (hw_counter >> 40) & 0xFF;

/* Set the alarm */
for (i = 0; i < 4; i++) {
@@ -193,7 +192,7 @@ static int ab3100_rtc_irq_enable(struct device *dev, unsigned int enabled)

static const struct rtc_class_ops ab3100_rtc_ops = {
.read_time = ab3100_rtc_read_time,
- .set_mmss = ab3100_rtc_set_mmss,
+ .set_mmss64 = ab3100_rtc_set_mmss,
.read_alarm = ab3100_rtc_read_alarm,
.set_alarm = ab3100_rtc_set_alarm,
.alarm_irq_enable = ab3100_rtc_irq_enable,
diff --git a/drivers/rtc/rtc-mc13xxx.c b/drivers/rtc/rtc-mc13xxx.c
index 5bce904b7ee6..32df1d812367 100644
--- a/drivers/rtc/rtc-mc13xxx.c
+++ b/drivers/rtc/rtc-mc13xxx.c
@@ -83,20 +83,19 @@ static int mc13xxx_rtc_read_time(struct device *dev, struct rtc_time *tm)
return ret;
} while (days1 != days2);

- rtc_time_to_tm(days1 * SEC_PER_DAY + seconds, tm);
+ rtc_time64_to_tm((time64_t)days1 * SEC_PER_DAY + seconds, tm);

return rtc_valid_tm(tm);
}

-static int mc13xxx_rtc_set_mmss(struct device *dev, unsigned long secs)
+static int mc13xxx_rtc_set_mmss(struct device *dev, time64_t secs)
{
struct mc13xxx_rtc *priv = dev_get_drvdata(dev);
unsigned int seconds, days;
unsigned int alarmseconds;
int ret;

- seconds = secs % SEC_PER_DAY;
- days = secs / SEC_PER_DAY;
+ days = div_s64_rem(secs, SEC_PER_DAY, &seconds);

mc13xxx_lock(priv->mc13xxx);

@@ -159,7 +158,7 @@ static int mc13xxx_rtc_read_alarm(struct device *dev, struct rtc_wkalrm *alarm)
{
struct mc13xxx_rtc *priv = dev_get_drvdata(dev);
unsigned seconds, days;
- unsigned long s1970;
+ time64_t s1970;
int enabled, pending;
int ret;

@@ -189,10 +188,10 @@ static int mc13xxx_rtc_read_alarm(struct device *dev, struct rtc_wkalrm *alarm)
alarm->enabled = enabled;
alarm->pending = pending;

- s1970 = days * SEC_PER_DAY + seconds;
+ s1970 = (time64_t)days * SEC_PER_DAY + seconds;

- rtc_time_to_tm(s1970, &alarm->time);
- dev_dbg(dev, "%s: %lu\n", __func__, s1970);
+ rtc_time64_to_tm(s1970, &alarm->time);
+ dev_dbg(dev, "%s: %lld\n", __func__, (long long)s1970);

return 0;
}
@@ -200,8 +199,8 @@ static int mc13xxx_rtc_read_alarm(struct device *dev, struct rtc_wkalrm *alarm)
static int mc13xxx_rtc_set_alarm(struct device *dev, struct rtc_wkalrm *alarm)
{
struct mc13xxx_rtc *priv = dev_get_drvdata(dev);
- unsigned long s1970;
- unsigned seconds, days;
+ time64_t s1970;
+ u32 seconds, days;
int ret;

mc13xxx_lock(priv->mc13xxx);
@@ -215,20 +214,17 @@ static int mc13xxx_rtc_set_alarm(struct device *dev, struct rtc_wkalrm *alarm)
if (unlikely(ret))
goto out;

- ret = rtc_tm_to_time(&alarm->time, &s1970);
- if (unlikely(ret))
- goto out;
+ s1970 = rtc_tm_to_time64(&alarm->time);

- dev_dbg(dev, "%s: o%2.s %lu\n", __func__, alarm->enabled ? "n" : "ff",
- s1970);
+ dev_dbg(dev, "%s: o%2.s %lld\n", __func__, alarm->enabled ? "n" : "ff",
+ (long long)s1970);

ret = mc13xxx_rtc_irq_enable_unlocked(dev, alarm->enabled,
MC13XXX_IRQ_TODA);
if (unlikely(ret))
goto out;

- seconds = s1970 % SEC_PER_DAY;
- days = s1970 / SEC_PER_DAY;
+ days = div_s64_rem(s1970, SEC_PER_DAY, &seconds);

ret = mc13xxx_reg_write(priv->mc13xxx, MC13XXX_RTCDAYA, days);
if (unlikely(ret))
@@ -268,7 +264,7 @@ static irqreturn_t mc13xxx_rtc_update_handler(int irq, void *dev)

static const struct rtc_class_ops mc13xxx_rtc_ops = {
.read_time = mc13xxx_rtc_read_time,
- .set_mmss = mc13xxx_rtc_set_mmss,
+ .set_mmss64 = mc13xxx_rtc_set_mmss,
.read_alarm = mc13xxx_rtc_read_alarm,
.set_alarm = mc13xxx_rtc_set_alarm,
.alarm_irq_enable = mc13xxx_rtc_alarm_irq_enable,
diff --git a/drivers/rtc/rtc-mxc.c b/drivers/rtc/rtc-mxc.c
index 3c3f8d10ab43..09d422b9f7f7 100644
--- a/drivers/rtc/rtc-mxc.c
+++ b/drivers/rtc/rtc-mxc.c
@@ -106,7 +106,7 @@ static inline int is_imx1_rtc(struct rtc_plat_data *data)
* This function is used to obtain the RTC time or the alarm value in
* second.
*/
-static u32 get_alarm_or_time(struct device *dev, int time_alarm)
+static time64_t get_alarm_or_time(struct device *dev, int time_alarm)
{
struct platform_device *pdev = to_platform_device(dev);
struct rtc_plat_data *pdata = platform_get_drvdata(pdev);
@@ -129,29 +129,28 @@ static u32 get_alarm_or_time(struct device *dev, int time_alarm)
hr = hr_min >> 8;
min = hr_min & 0xff;

- return (((day * 24 + hr) * 60) + min) * 60 + sec;
+ return ((((time64_t)day * 24 + hr) * 60) + min) * 60 + sec;
}

/*
* This function sets the RTC alarm value or the time value.
*/
-static void set_alarm_or_time(struct device *dev, int time_alarm, u32 time)
+static void set_alarm_or_time(struct device *dev, int time_alarm, time64_t time)
{
- u32 day, hr, min, sec, temp;
+ u32 tod, day, hr, min, sec, temp;
struct platform_device *pdev = to_platform_device(dev);
struct rtc_plat_data *pdata = platform_get_drvdata(pdev);
void __iomem *ioaddr = pdata->ioaddr;

- day = time / 86400;
- time -= day * 86400;
+ day = div_s64_rem(time, 86400, &tod);

/* time is within a day now */
- hr = time / 3600;
- time -= hr * 3600;
+ hr = tod / 3600;
+ tod -= hr * 3600;

/* time is within an hour now */
- min = time / 60;
- sec = time - min * 60;
+ min = tod / 60;
+ sec = tod - min * 60;

temp = (hr << 8) + min;

@@ -173,29 +172,18 @@ static void set_alarm_or_time(struct device *dev, int time_alarm, u32 time)
* This function updates the RTC alarm registers and then clears all the
* interrupt status bits.
*/
-static int rtc_update_alarm(struct device *dev, struct rtc_time *alrm)
+static void rtc_update_alarm(struct device *dev, struct rtc_time *alrm)
{
- struct rtc_time alarm_tm, now_tm;
- unsigned long now, time;
+ time64_t time;
struct platform_device *pdev = to_platform_device(dev);
struct rtc_plat_data *pdata = platform_get_drvdata(pdev);
void __iomem *ioaddr = pdata->ioaddr;

- now = get_alarm_or_time(dev, MXC_RTC_TIME);
- rtc_time_to_tm(now, &now_tm);
- alarm_tm.tm_year = now_tm.tm_year;
- alarm_tm.tm_mon = now_tm.tm_mon;
- alarm_tm.tm_mday = now_tm.tm_mday;
- alarm_tm.tm_hour = alrm->tm_hour;
- alarm_tm.tm_min = alrm->tm_min;
- alarm_tm.tm_sec = alrm->tm_sec;
- rtc_tm_to_time(&alarm_tm, &time);
+ time = rtc_tm_to_time64(alrm);

/* clear all the interrupt status bits */
writew(readw(ioaddr + RTC_RTCISR), ioaddr + RTC_RTCISR);
set_alarm_or_time(dev, MXC_RTC_ALARM, time);
-
- return 0;
}

static void mxc_rtc_irq_enable(struct device *dev, unsigned int bit,
@@ -283,14 +271,14 @@ static int mxc_rtc_alarm_irq_enable(struct device *dev, unsigned int enabled)
*/
static int mxc_rtc_read_time(struct device *dev, struct rtc_time *tm)
{
- u32 val;
+ time64_t val;

/* Avoid roll-over from reading the different registers */
do {
val = get_alarm_or_time(dev, MXC_RTC_TIME);
} while (val != get_alarm_or_time(dev, MXC_RTC_TIME));

- rtc_time_to_tm(val, tm);
+ rtc_time64_to_tm(val, tm);

return 0;
}
@@ -298,7 +286,7 @@ static int mxc_rtc_read_time(struct device *dev, struct rtc_time *tm)
/*
* This function sets the internal RTC time based on tm in Gregorian date.
*/
-static int mxc_rtc_set_mmss(struct device *dev, unsigned long time)
+static int mxc_rtc_set_mmss(struct device *dev, time64_t time)
{
struct platform_device *pdev = to_platform_device(dev);
struct rtc_plat_data *pdata = platform_get_drvdata(pdev);
@@ -309,9 +297,9 @@ static int mxc_rtc_set_mmss(struct device *dev, unsigned long time)
if (is_imx1_rtc(pdata)) {
struct rtc_time tm;

- rtc_time_to_tm(time, &tm);
+ rtc_time64_to_tm(time, &tm);
tm.tm_year = 70;
- rtc_tm_to_time(&tm, &time);
+ time = rtc_tm_to_time64(&tm);
}

/* Avoid roll-over from reading the different registers */
@@ -333,7 +321,7 @@ static int mxc_rtc_read_alarm(struct device *dev, struct rtc_wkalrm *alrm)
struct rtc_plat_data *pdata = platform_get_drvdata(pdev);
void __iomem *ioaddr = pdata->ioaddr;

- rtc_time_to_tm(get_alarm_or_time(dev, MXC_RTC_ALARM), &alrm->time);
+ rtc_time64_to_tm(get_alarm_or_time(dev, MXC_RTC_ALARM), &alrm->time);
alrm->pending = ((readw(ioaddr + RTC_RTCISR) & RTC_ALM_BIT)) ? 1 : 0;

return 0;
@@ -346,11 +334,8 @@ static int mxc_rtc_set_alarm(struct device *dev, struct rtc_wkalrm *alrm)
{
struct platform_device *pdev = to_platform_device(dev);
struct rtc_plat_data *pdata = platform_get_drvdata(pdev);
- int ret;

- ret = rtc_update_alarm(dev, &alrm->time);
- if (ret)
- return ret;
+ rtc_update_alarm(dev, &alrm->time);

memcpy(&pdata->g_rtc_alarm, &alrm->time, sizeof(struct rtc_time));
mxc_rtc_irq_enable(dev, RTC_ALM_BIT, alrm->enabled);
@@ -362,7 +347,7 @@ static int mxc_rtc_set_alarm(struct device *dev, struct rtc_wkalrm *alrm)
static struct rtc_class_ops mxc_rtc_ops = {
.release = mxc_rtc_release,
.read_time = mxc_rtc_read_time,
- .set_mmss = mxc_rtc_set_mmss,
+ .set_mmss64 = mxc_rtc_set_mmss,
.read_alarm = mxc_rtc_read_alarm,
.set_alarm = mxc_rtc_set_alarm,
.alarm_irq_enable = mxc_rtc_alarm_irq_enable,
diff --git a/drivers/rtc/rtc-test.c b/drivers/rtc/rtc-test.c
index 8f86fa91de1a..3a2da4c892d6 100644
--- a/drivers/rtc/rtc-test.c
+++ b/drivers/rtc/rtc-test.c
@@ -13,6 +13,10 @@
#include <linux/rtc.h>
#include <linux/platform_device.h>

+static int test_mmss64;
+module_param(test_mmss64, int, 0644);
+MODULE_PARM_DESC(test_mmss64, "Test struct rtc_class_ops.set_mmss64().");
+
static struct platform_device *test0 = NULL, *test1 = NULL;

static int test_rtc_read_alarm(struct device *dev,
@@ -30,7 +34,13 @@ static int test_rtc_set_alarm(struct device *dev,
static int test_rtc_read_time(struct device *dev,
struct rtc_time *tm)
{
- rtc_time_to_tm(get_seconds(), tm);
+ rtc_time64_to_tm(ktime_get_real_seconds(), tm);
+ return 0;
+}
+
+static int test_rtc_set_mmss64(struct device *dev, time64_t secs)
+{
+ dev_info(dev, "%s, secs = %lld\n", __func__, (long long)secs);
return 0;
}

@@ -55,7 +65,7 @@ static int test_rtc_alarm_irq_enable(struct device *dev, unsigned int enable)
return 0;
}

-static const struct rtc_class_ops test_rtc_ops = {
+static struct rtc_class_ops test_rtc_ops = {
.proc = test_rtc_proc,
.read_time = test_rtc_read_time,
.read_alarm = test_rtc_read_alarm,
@@ -101,6 +111,11 @@ static int test_probe(struct platform_device *plat_dev)
int err;
struct rtc_device *rtc;

+ if (test_mmss64) {
+ test_rtc_ops.set_mmss64 = test_rtc_set_mmss64;
+ test_rtc_ops.set_mmss = NULL;
+ }
+
rtc = devm_rtc_device_register(&plat_dev->dev, "test",
&test_rtc_ops, THIS_MODULE);
if (IS_ERR(rtc)) {
diff --git a/drivers/rtc/systohc.c b/drivers/rtc/systohc.c
index eb71872d0361..7728d5e32bf4 100644
--- a/drivers/rtc/systohc.c
+++ b/drivers/rtc/systohc.c
@@ -11,7 +11,7 @@
* rtc_set_ntp_time - Save NTP synchronized time to the RTC
* @now: Current time of day
*
- * Replacement for the NTP platform function update_persistent_clock
+ * Replacement for the NTP platform function update_persistent_clock64
* that stores time for later retrieval by rtc_hctosys.
*
* Returns 0 on successful RTC update, -ENODEV if a RTC update is not
@@ -35,7 +35,10 @@ int rtc_set_ntp_time(struct timespec64 now)
if (rtc) {
/* rtc_hctosys exclusively uses UTC, so we call set_time here,
* not set_mmss. */
- if (rtc->ops && (rtc->ops->set_time || rtc->ops->set_mmss))
+ if (rtc->ops &&
+ (rtc->ops->set_time ||
+ rtc->ops->set_mmss64 ||
+ rtc->ops->set_mmss))
err = rtc_set_time(rtc, &tm);
rtc_class_close(rtc);
}
diff --git a/include/linux/clockchips.h b/include/linux/clockchips.h
index 2e4cb67f6e56..96c280b2c263 100644
--- a/include/linux/clockchips.h
+++ b/include/linux/clockchips.h
@@ -8,33 +8,19 @@
#ifndef _LINUX_CLOCKCHIPS_H
#define _LINUX_CLOCKCHIPS_H

-/* Clock event notification values */
-enum clock_event_nofitiers {
- CLOCK_EVT_NOTIFY_ADD,
- CLOCK_EVT_NOTIFY_BROADCAST_ON,
- CLOCK_EVT_NOTIFY_BROADCAST_OFF,
- CLOCK_EVT_NOTIFY_BROADCAST_FORCE,
- CLOCK_EVT_NOTIFY_BROADCAST_ENTER,
- CLOCK_EVT_NOTIFY_BROADCAST_EXIT,
- CLOCK_EVT_NOTIFY_SUSPEND,
- CLOCK_EVT_NOTIFY_RESUME,
- CLOCK_EVT_NOTIFY_CPU_DYING,
- CLOCK_EVT_NOTIFY_CPU_DEAD,
-};
-
-#ifdef CONFIG_GENERIC_CLOCKEVENTS_BUILD
+#ifdef CONFIG_GENERIC_CLOCKEVENTS

-#include <linux/clocksource.h>
-#include <linux/cpumask.h>
-#include <linux/ktime.h>
-#include <linux/notifier.h>
+# include <linux/clocksource.h>
+# include <linux/cpumask.h>
+# include <linux/ktime.h>
+# include <linux/notifier.h>

struct clock_event_device;
struct module;

-/* Clock event mode commands */
+/* Clock event mode commands for legacy ->set_mode(): OBSOLETE */
enum clock_event_mode {
- CLOCK_EVT_MODE_UNUSED = 0,
+ CLOCK_EVT_MODE_UNUSED,
CLOCK_EVT_MODE_SHUTDOWN,
CLOCK_EVT_MODE_PERIODIC,
CLOCK_EVT_MODE_ONESHOT,
@@ -42,30 +28,49 @@ enum clock_event_mode {
};

/*
+ * Possible states of a clock event device.
+ *
+ * DETACHED: Device is not used by clockevents core. Initial state or can be
+ * reached from SHUTDOWN.
+ * SHUTDOWN: Device is powered-off. Can be reached from PERIODIC or ONESHOT.
+ * PERIODIC: Device is programmed to generate events periodically. Can be
+ * reached from DETACHED or SHUTDOWN.
+ * ONESHOT: Device is programmed to generate event only once. Can be reached
+ * from DETACHED or SHUTDOWN.
+ */
+enum clock_event_state {
+ CLOCK_EVT_STATE_DETACHED,
+ CLOCK_EVT_STATE_SHUTDOWN,
+ CLOCK_EVT_STATE_PERIODIC,
+ CLOCK_EVT_STATE_ONESHOT,
+};
+
+/*
* Clock event features
*/
-#define CLOCK_EVT_FEAT_PERIODIC 0x000001
-#define CLOCK_EVT_FEAT_ONESHOT 0x000002
-#define CLOCK_EVT_FEAT_KTIME 0x000004
+# define CLOCK_EVT_FEAT_PERIODIC 0x000001
+# define CLOCK_EVT_FEAT_ONESHOT 0x000002
+# define CLOCK_EVT_FEAT_KTIME 0x000004
+
/*
- * x86(64) specific misfeatures:
+ * x86(64) specific (mis)features:
*
* - Clockevent source stops in C3 State and needs broadcast support.
* - Local APIC timer is used as a dummy device.
*/
-#define CLOCK_EVT_FEAT_C3STOP 0x000008
-#define CLOCK_EVT_FEAT_DUMMY 0x000010
+# define CLOCK_EVT_FEAT_C3STOP 0x000008
+# define CLOCK_EVT_FEAT_DUMMY 0x000010

/*
* Core shall set the interrupt affinity dynamically in broadcast mode
*/
-#define CLOCK_EVT_FEAT_DYNIRQ 0x000020
-#define CLOCK_EVT_FEAT_PERCPU 0x000040
+# define CLOCK_EVT_FEAT_DYNIRQ 0x000020
+# define CLOCK_EVT_FEAT_PERCPU 0x000040

/*
* Clockevent device is based on a hrtimer for broadcast
*/
-#define CLOCK_EVT_FEAT_HRTIMER 0x000080
+# define CLOCK_EVT_FEAT_HRTIMER 0x000080

/**
* struct clock_event_device - clock event device descriptor
@@ -78,10 +83,15 @@ enum clock_event_mode {
* @min_delta_ns: minimum delta value in ns
* @mult: nanosecond to cycles multiplier
* @shift: nanoseconds to cycles divisor (power of two)
- * @mode: operating mode assigned by the management code
+ * @mode: operating mode, relevant only to ->set_mode(), OBSOLETE
+ * @state: current state of the device, assigned by the core code
* @features: features
* @retries: number of forced programming retries
- * @set_mode: set mode function
+ * @set_mode: legacy set mode function, only for modes <= CLOCK_EVT_MODE_RESUME.
+ * @set_state_periodic: switch state to periodic, if !set_mode
+ * @set_state_oneshot: switch state to oneshot, if !set_mode
+ * @set_state_shutdown: switch state to shutdown, if !set_mode
+ * @tick_resume: resume clkevt device, if !set_mode
* @broadcast: function to broadcast events
* @min_delta_ticks: minimum delta value in ticks stored for reconfiguration
* @max_delta_ticks: maximum delta value in ticks stored for reconfiguration
@@ -95,22 +105,31 @@ enum clock_event_mode {
*/
struct clock_event_device {
void (*event_handler)(struct clock_event_device *);
- int (*set_next_event)(unsigned long evt,
- struct clock_event_device *);
- int (*set_next_ktime)(ktime_t expires,
- struct clock_event_device *);
+ int (*set_next_event)(unsigned long evt, struct clock_event_device *);
+ int (*set_next_ktime)(ktime_t expires, struct clock_event_device *);
ktime_t next_event;
u64 max_delta_ns;
u64 min_delta_ns;
u32 mult;
u32 shift;
enum clock_event_mode mode;
+ enum clock_event_state state;
unsigned int features;
unsigned long retries;

+ /*
+ * State transition callback(s): Only one of the two groups should be
+ * defined:
+ * - set_mode(), only for modes <= CLOCK_EVT_MODE_RESUME.
+ * - set_state_{shutdown|periodic|oneshot}(), tick_resume().
+ */
+ void (*set_mode)(enum clock_event_mode mode, struct clock_event_device *);
+ int (*set_state_periodic)(struct clock_event_device *);
+ int (*set_state_oneshot)(struct clock_event_device *);
+ int (*set_state_shutdown)(struct clock_event_device *);
+ int (*tick_resume)(struct clock_event_device *);
+
void (*broadcast)(const struct cpumask *mask);
- void (*set_mode)(enum clock_event_mode mode,
- struct clock_event_device *);
void (*suspend)(struct clock_event_device *);
void (*resume)(struct clock_event_device *);
unsigned long min_delta_ticks;
@@ -136,18 +155,18 @@ struct clock_event_device {
*
* factor = (clock_ticks << shift) / nanoseconds
*/
-static inline unsigned long div_sc(unsigned long ticks, unsigned long nsec,
- int shift)
+static inline unsigned long
+div_sc(unsigned long ticks, unsigned long nsec, int shift)
{
- uint64_t tmp = ((uint64_t)ticks) << shift;
+ u64 tmp = ((u64)ticks) << shift;

do_div(tmp, nsec);
+
return (unsigned long) tmp;
}

/* Clock event layer functions */
-extern u64 clockevent_delta2ns(unsigned long latch,
- struct clock_event_device *evt);
+extern u64 clockevent_delta2ns(unsigned long latch, struct clock_event_device *evt);
extern void clockevents_register_device(struct clock_event_device *dev);
extern int clockevents_unbind_device(struct clock_event_device *ced, int cpu);

@@ -158,57 +177,42 @@ extern void clockevents_config_and_register(struct clock_event_device *dev,

extern int clockevents_update_freq(struct clock_event_device *ce, u32 freq);

-extern void clockevents_exchange_device(struct clock_event_device *old,
- struct clock_event_device *new);
-extern void clockevents_set_mode(struct clock_event_device *dev,
- enum clock_event_mode mode);
-extern int clockevents_program_event(struct clock_event_device *dev,
- ktime_t expires, bool force);
-
-extern void clockevents_handle_noop(struct clock_event_device *dev);
-
static inline void
clockevents_calc_mult_shift(struct clock_event_device *ce, u32 freq, u32 minsec)
{
- return clocks_calc_mult_shift(&ce->mult, &ce->shift, NSEC_PER_SEC,
- freq, minsec);
+ return clocks_calc_mult_shift(&ce->mult, &ce->shift, NSEC_PER_SEC, freq, minsec);
}

extern void clockevents_suspend(void);
extern void clockevents_resume(void);

-#ifdef CONFIG_GENERIC_CLOCKEVENTS_BROADCAST
-#ifdef CONFIG_ARCH_HAS_TICK_BROADCAST
+# ifdef CONFIG_GENERIC_CLOCKEVENTS_BROADCAST
+# ifdef CONFIG_ARCH_HAS_TICK_BROADCAST
extern void tick_broadcast(const struct cpumask *mask);
-#else
-#define tick_broadcast NULL
-#endif
+# else
+# define tick_broadcast NULL
+# endif
extern int tick_receive_broadcast(void);
-#endif
+# endif

-#if defined(CONFIG_GENERIC_CLOCKEVENTS_BROADCAST) && defined(CONFIG_TICK_ONESHOT)
+# if defined(CONFIG_GENERIC_CLOCKEVENTS_BROADCAST) && defined(CONFIG_TICK_ONESHOT)
extern void tick_setup_hrtimer_broadcast(void);
extern int tick_check_broadcast_expired(void);
-#else
+# else
static inline int tick_check_broadcast_expired(void) { return 0; }
-static inline void tick_setup_hrtimer_broadcast(void) {};
-#endif
+static inline void tick_setup_hrtimer_broadcast(void) { }
+# endif

-#ifdef CONFIG_GENERIC_CLOCKEVENTS
extern int clockevents_notify(unsigned long reason, void *arg);
-#else
-static inline int clockevents_notify(unsigned long reason, void *arg) { return 0; }
-#endif
-
-#else /* CONFIG_GENERIC_CLOCKEVENTS_BUILD */

-static inline void clockevents_suspend(void) {}
-static inline void clockevents_resume(void) {}
+#else /* !CONFIG_GENERIC_CLOCKEVENTS: */

+static inline void clockevents_suspend(void) { }
+static inline void clockevents_resume(void) { }
static inline int clockevents_notify(unsigned long reason, void *arg) { return 0; }
static inline int tick_check_broadcast_expired(void) { return 0; }
-static inline void tick_setup_hrtimer_broadcast(void) {};
+static inline void tick_setup_hrtimer_broadcast(void) { }

-#endif
+#endif /* !CONFIG_GENERIC_CLOCKEVENTS */

-#endif
+#endif /* _LINUX_CLOCKCHIPS_H */
diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
index 9c78d15d33e4..135509821c39 100644
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -56,6 +56,7 @@ struct module;
* @shift: cycle to nanosecond divisor (power of two)
* @max_idle_ns: max idle time permitted by the clocksource (nsecs)
* @maxadj: maximum adjustment value to mult (~11%)
+ * @max_cycles: maximum safe cycle value which won't overflow on multiplication
* @flags: flags describing special properties
* @archdata: arch-specific data
* @suspend: suspend function for the clocksource, if necessary
@@ -76,7 +77,7 @@ struct clocksource {
#ifdef CONFIG_ARCH_CLOCKSOURCE_DATA
struct arch_clocksource_data archdata;
#endif
-
+ u64 max_cycles;
const char *name;
struct list_head list;
int rating;
@@ -178,7 +179,6 @@ static inline s64 clocksource_cyc2ns(cycle_t cycles, u32 mult, u32 shift)
}


-extern int clocksource_register(struct clocksource*);
extern int clocksource_unregister(struct clocksource*);
extern void clocksource_touch_watchdog(void);
extern struct clocksource* clocksource_get_next(void);
@@ -189,7 +189,7 @@ extern struct clocksource * __init clocksource_default_clock(void);
extern void clocksource_mark_unstable(struct clocksource *cs);

extern u64
-clocks_calc_max_nsecs(u32 mult, u32 shift, u32 maxadj, u64 mask);
+clocks_calc_max_nsecs(u32 mult, u32 shift, u32 maxadj, u64 mask, u64 *max_cycles);
extern void
clocks_calc_mult_shift(u32 *mult, u32 *shift, u32 from, u32 to, u32 minsec);

@@ -200,7 +200,16 @@ clocks_calc_mult_shift(u32 *mult, u32 *shift, u32 from, u32 to, u32 minsec);
extern int
__clocksource_register_scale(struct clocksource *cs, u32 scale, u32 freq);
extern void
-__clocksource_updatefreq_scale(struct clocksource *cs, u32 scale, u32 freq);
+__clocksource_update_freq_scale(struct clocksource *cs, u32 scale, u32 freq);
+
+/*
+ * Don't call this unless you are a default clocksource
+ * (AKA: jiffies) and absolutely have to.
+ */
+static inline int __clocksource_register(struct clocksource *cs)
+{
+ return __clocksource_register_scale(cs, 1, 0);
+}

static inline int clocksource_register_hz(struct clocksource *cs, u32 hz)
{
@@ -212,14 +221,14 @@ static inline int clocksource_register_khz(struct clocksource *cs, u32 khz)
return __clocksource_register_scale(cs, 1000, khz);
}

-static inline void __clocksource_updatefreq_hz(struct clocksource *cs, u32 hz)
+static inline void __clocksource_update_freq_hz(struct clocksource *cs, u32 hz)
{
- __clocksource_updatefreq_scale(cs, 1, hz);
+ __clocksource_update_freq_scale(cs, 1, hz);
}

-static inline void __clocksource_updatefreq_khz(struct clocksource *cs, u32 khz)
+static inline void __clocksource_update_freq_khz(struct clocksource *cs, u32 khz)
{
- __clocksource_updatefreq_scale(cs, 1000, khz);
+ __clocksource_update_freq_scale(cs, 1000, khz);
}


diff --git a/include/linux/rtc.h b/include/linux/rtc.h
index dcad7ee0d746..8dcf6825fa88 100644
--- a/include/linux/rtc.h
+++ b/include/linux/rtc.h
@@ -77,6 +77,7 @@ struct rtc_class_ops {
int (*read_alarm)(struct device *, struct rtc_wkalrm *);
int (*set_alarm)(struct device *, struct rtc_wkalrm *);
int (*proc)(struct device *, struct seq_file *);
+ int (*set_mmss64)(struct device *, time64_t secs);
int (*set_mmss)(struct device *, unsigned long secs);
int (*read_callback)(struct device *, int data);
int (*alarm_irq_enable)(struct device *, unsigned int enabled);
diff --git a/include/linux/tick.h b/include/linux/tick.h
index 9c085dc12ae9..f8492da57ad3 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -1,7 +1,5 @@
-/* linux/include/linux/tick.h
- *
- * This file contains the structure definitions for tick related functions
- *
+/*
+ * Tick related global functions
*/
#ifndef _LINUX_TICK_H
#define _LINUX_TICK_H
@@ -9,149 +7,99 @@
#include <linux/clockchips.h>
#include <linux/irqflags.h>
#include <linux/percpu.h>
-#include <linux/hrtimer.h>
#include <linux/context_tracking_state.h>
#include <linux/cpumask.h>
#include <linux/sched.h>

#ifdef CONFIG_GENERIC_CLOCKEVENTS
-
-enum tick_device_mode {
- TICKDEV_MODE_PERIODIC,
- TICKDEV_MODE_ONESHOT,
-};
-
-struct tick_device {
- struct clock_event_device *evtdev;
- enum tick_device_mode mode;
-};
-
-enum tick_nohz_mode {
- NOHZ_MODE_INACTIVE,
- NOHZ_MODE_LOWRES,
- NOHZ_MODE_HIGHRES,
-};
-
-/**
- * struct tick_sched - sched tick emulation and no idle tick control/stats
- * @sched_timer: hrtimer to schedule the periodic tick in high
- * resolution mode
- * @last_tick: Store the last tick expiry time when the tick
- * timer is modified for nohz sleeps. This is necessary
- * to resume the tick timer operation in the timeline
- * when the CPU returns from nohz sleep.
- * @tick_stopped: Indicator that the idle tick has been stopped
- * @idle_jiffies: jiffies at the entry to idle for idle time accounting
- * @idle_calls: Total number of idle calls
- * @idle_sleeps: Number of idle calls, where the sched tick was stopped
- * @idle_entrytime: Time when the idle call was entered
- * @idle_waketime: Time when the idle was interrupted
- * @idle_exittime: Time when the idle state was left
- * @idle_sleeptime: Sum of the time slept in idle with sched tick stopped
- * @iowait_sleeptime: Sum of the time slept in idle with sched tick stopped, with IO outstanding
- * @sleep_length: Duration of the current idle sleep
- * @do_timer_lst: CPU was the last one doing do_timer before going idle
- */
-struct tick_sched {
- struct hrtimer sched_timer;
- unsigned long check_clocks;
- enum tick_nohz_mode nohz_mode;
- ktime_t last_tick;
- int inidle;
- int tick_stopped;
- unsigned long idle_jiffies;
- unsigned long idle_calls;
- unsigned long idle_sleeps;
- int idle_active;
- ktime_t idle_entrytime;
- ktime_t idle_waketime;
- ktime_t idle_exittime;
- ktime_t idle_sleeptime;
- ktime_t iowait_sleeptime;
- ktime_t sleep_length;
- unsigned long last_jiffies;
- unsigned long next_jiffies;
- ktime_t idle_expires;
- int do_timer_last;
-};
-
extern void __init tick_init(void);
-extern int tick_is_oneshot_available(void);
-extern struct tick_device *tick_get_device(int cpu);
-
extern void tick_freeze(void);
extern void tick_unfreeze(void);
+/* Should be core only, but ARM BL switcher requires it */
+extern void tick_suspend_local(void);
+/* Should be core only, but XEN resume magic and ARM BL switcher require it */
+extern void tick_resume_local(void);
+extern void tick_handover_do_timer(void);
+extern void tick_cleanup_dead_cpu(int cpu);
+#else /* CONFIG_GENERIC_CLOCKEVENTS */
+static inline void tick_init(void) { }
+static inline void tick_freeze(void) { }
+static inline void tick_unfreeze(void) { }
+static inline void tick_suspend_local(void) { }
+static inline void tick_resume_local(void) { }
+static inline void tick_handover_do_timer(void) { }
+static inline void tick_cleanup_dead_cpu(int cpu) { }
+#endif /* !CONFIG_GENERIC_CLOCKEVENTS */

-# ifdef CONFIG_HIGH_RES_TIMERS
-extern int tick_init_highres(void);
-extern int tick_program_event(ktime_t expires, int force);
-extern void tick_setup_sched_timer(void);
-# endif
-
-# if defined CONFIG_NO_HZ_COMMON || defined CONFIG_HIGH_RES_TIMERS
-extern void tick_cancel_sched_timer(int cpu);
-# else
-static inline void tick_cancel_sched_timer(int cpu) { }
-# endif
-
-# ifdef CONFIG_GENERIC_CLOCKEVENTS_BROADCAST
-extern struct tick_device *tick_get_broadcast_device(void);
-extern struct cpumask *tick_get_broadcast_mask(void);
-
-# ifdef CONFIG_TICK_ONESHOT
-extern struct cpumask *tick_get_broadcast_oneshot_mask(void);
-# endif
-
-# endif /* BROADCAST */
-
-# ifdef CONFIG_TICK_ONESHOT
-extern void tick_clock_notify(void);
-extern int tick_check_oneshot_change(int allow_nohz);
-extern struct tick_sched *tick_get_tick_sched(int cpu);
+#ifdef CONFIG_TICK_ONESHOT
extern void tick_irq_enter(void);
-extern int tick_oneshot_mode_active(void);
# ifndef arch_needs_cpu
# define arch_needs_cpu() (0)
# endif
# else
-static inline void tick_clock_notify(void) { }
-static inline int tick_check_oneshot_change(int allow_nohz) { return 0; }
static inline void tick_irq_enter(void) { }
-static inline int tick_oneshot_mode_active(void) { return 0; }
-# endif
+#endif

-#else /* CONFIG_GENERIC_CLOCKEVENTS */
-static inline void tick_init(void) { }
-static inline void tick_freeze(void) { }
-static inline void tick_unfreeze(void) { }
-static inline void tick_cancel_sched_timer(int cpu) { }
-static inline void tick_clock_notify(void) { }
-static inline int tick_check_oneshot_change(int allow_nohz) { return 0; }
-static inline void tick_irq_enter(void) { }
-static inline int tick_oneshot_mode_active(void) { return 0; }
-#endif /* !CONFIG_GENERIC_CLOCKEVENTS */
+#if defined(CONFIG_GENERIC_CLOCKEVENTS_BROADCAST) && defined(CONFIG_TICK_ONESHOT)
+extern void hotplug_cpu__broadcast_tick_pull(int dead_cpu);
+#else
+static inline void hotplug_cpu__broadcast_tick_pull(int dead_cpu) { }
+#endif

-# ifdef CONFIG_NO_HZ_COMMON
-DECLARE_PER_CPU(struct tick_sched, tick_cpu_sched);
+enum tick_broadcast_mode {
+ TICK_BROADCAST_OFF,
+ TICK_BROADCAST_ON,
+ TICK_BROADCAST_FORCE,
+};
+
+enum tick_broadcast_state {
+ TICK_BROADCAST_EXIT,
+ TICK_BROADCAST_ENTER,
+};
+
+#ifdef CONFIG_GENERIC_CLOCKEVENTS_BROADCAST
+extern void tick_broadcast_control(enum tick_broadcast_mode mode);
+#else
+static inline void tick_broadcast_control(enum tick_broadcast_mode mode) { }
+#endif /* BROADCAST */
+
+#if defined(CONFIG_GENERIC_CLOCKEVENTS_BROADCAST) && defined(CONFIG_TICK_ONESHOT)
+extern int tick_broadcast_oneshot_control(enum tick_broadcast_state state);
+#else
+static inline int tick_broadcast_oneshot_control(enum tick_broadcast_state state) { return 0; }
+#endif

-static inline int tick_nohz_tick_stopped(void)
+static inline void tick_broadcast_enable(void)
+{
+ tick_broadcast_control(TICK_BROADCAST_ON);
+}
+static inline void tick_broadcast_disable(void)
+{
+ tick_broadcast_control(TICK_BROADCAST_OFF);
+}
+static inline void tick_broadcast_force(void)
+{
+ tick_broadcast_control(TICK_BROADCAST_FORCE);
+}
+static inline int tick_broadcast_enter(void)
{
- return __this_cpu_read(tick_cpu_sched.tick_stopped);
+ return tick_broadcast_oneshot_control(TICK_BROADCAST_ENTER);
+}
+static inline void tick_broadcast_exit(void)
+{
+ tick_broadcast_oneshot_control(TICK_BROADCAST_EXIT);
}

+#ifdef CONFIG_NO_HZ_COMMON
+extern int tick_nohz_tick_stopped(void);
extern void tick_nohz_idle_enter(void);
extern void tick_nohz_idle_exit(void);
extern void tick_nohz_irq_exit(void);
extern ktime_t tick_nohz_get_sleep_length(void);
extern u64 get_cpu_idle_time_us(int cpu, u64 *last_update_time);
extern u64 get_cpu_iowait_time_us(int cpu, u64 *last_update_time);
-
-# else /* !CONFIG_NO_HZ_COMMON */
-static inline int tick_nohz_tick_stopped(void)
-{
- return 0;
-}
-
+#else /* !CONFIG_NO_HZ_COMMON */
+static inline int tick_nohz_tick_stopped(void) { return 0; }
static inline void tick_nohz_idle_enter(void) { }
static inline void tick_nohz_idle_exit(void) { }

@@ -163,7 +111,7 @@ static inline ktime_t tick_nohz_get_sleep_length(void)
}
static inline u64 get_cpu_idle_time_us(int cpu, u64 *unused) { return -1; }
static inline u64 get_cpu_iowait_time_us(int cpu, u64 *unused) { return -1; }
-# endif /* !CONFIG_NO_HZ_COMMON */
+#endif /* !CONFIG_NO_HZ_COMMON */

#ifdef CONFIG_NO_HZ_FULL
extern bool tick_nohz_full_running;
diff --git a/include/linux/timekeeper_internal.h b/include/linux/timekeeper_internal.h
index 05af9a334893..fb86963859c7 100644
--- a/include/linux/timekeeper_internal.h
+++ b/include/linux/timekeeper_internal.h
@@ -16,16 +16,16 @@
* @read: Read function of @clock
* @mask: Bitmask for two's complement subtraction of non 64bit clocks
* @cycle_last: @clock cycle value at last update
- * @mult: NTP adjusted multiplier for scaled math conversion
+ * @mult: (NTP adjusted) multiplier for scaled math conversion
* @shift: Shift value for scaled math conversion
* @xtime_nsec: Shifted (fractional) nano seconds offset for readout
- * @base_mono: ktime_t (nanoseconds) base time for readout
+ * @base: ktime_t (nanoseconds) base time for readout
*
* This struct has size 56 byte on 64 bit. Together with a seqcount it
* occupies a single 64byte cache line.
*
* The struct is separate from struct timekeeper as it is also used
- * for a fast NMI safe accessor to clock monotonic.
+ * for a fast NMI safe accessors.
*/
struct tk_read_base {
struct clocksource *clock;
@@ -35,12 +35,13 @@ struct tk_read_base {
u32 mult;
u32 shift;
u64 xtime_nsec;
- ktime_t base_mono;
+ ktime_t base;
};

/**
* struct timekeeper - Structure holding internal timekeeping values.
- * @tkr: The readout base structure
+ * @tkr_mono: The readout base structure for CLOCK_MONOTONIC
+ * @tkr_raw: The readout base structure for CLOCK_MONOTONIC_RAW
* @xtime_sec: Current CLOCK_REALTIME time in seconds
* @ktime_sec: Current CLOCK_MONOTONIC time in seconds
* @wall_to_monotonic: CLOCK_REALTIME to CLOCK_MONOTONIC offset
@@ -48,7 +49,6 @@ struct tk_read_base {
* @offs_boot: Offset clock monotonic -> clock boottime
* @offs_tai: Offset clock monotonic -> clock tai
* @tai_offset: The current UTC to TAI offset in seconds
- * @base_raw: Monotonic raw base time in ktime_t format
* @raw_time: Monotonic raw base time in timespec64 format
* @cycle_interval: Number of clock cycles in one NTP interval
* @xtime_interval: Number of clock shifted nano seconds in one NTP
@@ -76,7 +76,8 @@ struct tk_read_base {
* used instead.
*/
struct timekeeper {
- struct tk_read_base tkr;
+ struct tk_read_base tkr_mono;
+ struct tk_read_base tkr_raw;
u64 xtime_sec;
unsigned long ktime_sec;
struct timespec64 wall_to_monotonic;
@@ -84,7 +85,6 @@ struct timekeeper {
ktime_t offs_boot;
ktime_t offs_tai;
s32 tai_offset;
- ktime_t base_raw;
struct timespec64 raw_time;

/* The following members are for timekeeping internal use */
diff --git a/include/linux/timekeeping.h b/include/linux/timekeeping.h
index 3eaae4754275..99176af216af 100644
--- a/include/linux/timekeeping.h
+++ b/include/linux/timekeeping.h
@@ -214,12 +214,18 @@ static inline u64 ktime_get_boot_ns(void)
return ktime_to_ns(ktime_get_boottime());
}

+static inline u64 ktime_get_tai_ns(void)
+{
+ return ktime_to_ns(ktime_get_clocktai());
+}
+
static inline u64 ktime_get_raw_ns(void)
{
return ktime_to_ns(ktime_get_raw());
}

extern u64 ktime_get_mono_fast_ns(void);
+extern u64 ktime_get_raw_fast_ns(void);

/*
* Timespec interfaces utilizing the ktime based ones
@@ -242,6 +248,9 @@ static inline void timekeeping_clocktai(struct timespec *ts)
/*
* RTC specific
*/
+extern bool timekeeping_rtc_skipsuspend(void);
+extern bool timekeeping_rtc_skipresume(void);
+
extern void timekeeping_inject_sleeptime64(struct timespec64 *delta);

/*
@@ -253,17 +262,14 @@ extern void getnstime_raw_and_real(struct timespec *ts_raw,
/*
* Persistent clock related interfaces
*/
-extern bool persistent_clock_exist;
extern int persistent_clock_is_local;

-static inline bool has_persistent_clock(void)
-{
- return persistent_clock_exist;
-}
-
extern void read_persistent_clock(struct timespec *ts);
+extern void read_persistent_clock64(struct timespec64 *ts);
extern void read_boot_clock(struct timespec *ts);
+extern void read_boot_clock64(struct timespec64 *ts);
extern int update_persistent_clock(struct timespec now);
+extern int update_persistent_clock64(struct timespec64 now);


#endif
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 1972b161c61e..82eea9c5af61 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -20,6 +20,7 @@
#include <linux/gfp.h>
#include <linux/suspend.h>
#include <linux/lockdep.h>
+#include <linux/tick.h>
#include <trace/events/power.h>

#include "smpboot.h"
@@ -338,6 +339,8 @@ static int __ref take_cpu_down(void *_param)
return err;

cpu_notify(CPU_DYING | param->mod, param->hcpu);
+ /* Give up timekeeping duties */
+ tick_handover_do_timer();
/* Park the stopper thread */
kthread_park(current);
return 0;
@@ -411,10 +414,12 @@ static int __ref _cpu_down(unsigned int cpu, int tasks_frozen)
while (!idle_cpu(cpu))
cpu_relax();

+ hotplug_cpu__broadcast_tick_pull(cpu);
/* This actually kills the CPU. */
__cpu_die(cpu);

/* CPU is completely dead: tell everyone. Too late to complain. */
+ tick_cleanup_dead_cpu(cpu);
cpu_notify_nofail(CPU_DEAD | mod, hcpu);

check_for_tasks(cpu);
diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
index 80014a178342..4d207d2abcbd 100644
--- a/kernel/sched/idle.c
+++ b/kernel/sched/idle.c
@@ -158,8 +158,7 @@ static void cpuidle_idle_call(void)
* is used from another cpu as a broadcast timer, this call may
* fail if it is not available
*/
- if (broadcast &&
- clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER, &dev->cpu))
+ if (broadcast && tick_broadcast_enter())
goto use_default;

/* Take note of the planned idle state. */
@@ -176,7 +175,7 @@ static void cpuidle_idle_call(void)
idle_set_state(this_rq(), NULL);

if (broadcast)
- clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_EXIT, &dev->cpu);
+ tick_broadcast_exit();

/*
* Give the governor an opportunity to reflect on the outcome
diff --git a/kernel/time/Kconfig b/kernel/time/Kconfig
index d626dc98e8df..579ce1b929af 100644
--- a/kernel/time/Kconfig
+++ b/kernel/time/Kconfig
@@ -33,12 +33,6 @@ config ARCH_USES_GETTIMEOFFSET
config GENERIC_CLOCKEVENTS
bool

-# Migration helper. Builds, but does not invoke
-config GENERIC_CLOCKEVENTS_BUILD
- bool
- default y
- depends on GENERIC_CLOCKEVENTS
-
# Architecture can handle broadcast in a driver-agnostic way
config ARCH_HAS_TICK_BROADCAST
bool
diff --git a/kernel/time/Makefile b/kernel/time/Makefile
index c09c07817d7a..01f0312419b3 100644
--- a/kernel/time/Makefile
+++ b/kernel/time/Makefile
@@ -2,15 +2,13 @@ obj-y += time.o timer.o hrtimer.o itimer.o posix-timers.o posix-cpu-timers.o
obj-y += timekeeping.o ntp.o clocksource.o jiffies.o timer_list.o
obj-y += timeconv.o timecounter.o posix-clock.o alarmtimer.o

-obj-$(CONFIG_GENERIC_CLOCKEVENTS_BUILD) += clockevents.o
-obj-$(CONFIG_GENERIC_CLOCKEVENTS) += tick-common.o
+obj-$(CONFIG_GENERIC_CLOCKEVENTS) += clockevents.o tick-common.o
ifeq ($(CONFIG_GENERIC_CLOCKEVENTS_BROADCAST),y)
obj-y += tick-broadcast.o
obj-$(CONFIG_TICK_ONESHOT) += tick-broadcast-hrtimer.o
endif
obj-$(CONFIG_GENERIC_SCHED_CLOCK) += sched_clock.o
-obj-$(CONFIG_TICK_ONESHOT) += tick-oneshot.o
-obj-$(CONFIG_TICK_ONESHOT) += tick-sched.o
+obj-$(CONFIG_TICK_ONESHOT) += tick-oneshot.o tick-sched.o
obj-$(CONFIG_TIMER_STATS) += timer_stats.o
obj-$(CONFIG_DEBUG_FS) += timekeeping_debug.o
obj-$(CONFIG_TEST_UDELAY) += test_udelay.o
diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
index 55449909f114..25d942d1da27 100644
--- a/kernel/time/clockevents.c
+++ b/kernel/time/clockevents.c
@@ -94,25 +94,76 @@ u64 clockevent_delta2ns(unsigned long latch, struct clock_event_device *evt)
}
EXPORT_SYMBOL_GPL(clockevent_delta2ns);

+static int __clockevents_set_state(struct clock_event_device *dev,
+ enum clock_event_state state)
+{
+ /* Transition with legacy set_mode() callback */
+ if (dev->set_mode) {
+ /* Legacy callback doesn't support new modes */
+ if (state > CLOCK_EVT_STATE_ONESHOT)
+ return -ENOSYS;
+ /*
+ * 'clock_event_state' and 'clock_event_mode' have 1-to-1
+ * mapping until *_ONESHOT, and so a simple cast will work.
+ */
+ dev->set_mode((enum clock_event_mode)state, dev);
+ dev->mode = (enum clock_event_mode)state;
+ return 0;
+ }
+
+ if (dev->features & CLOCK_EVT_FEAT_DUMMY)
+ return 0;
+
+ /* Transition with new state-specific callbacks */
+ switch (state) {
+ case CLOCK_EVT_STATE_DETACHED:
+ /*
+ * This is an internal state, which is guaranteed to go from
+ * SHUTDOWN to DETACHED. No driver interaction required.
+ */
+ return 0;
+
+ case CLOCK_EVT_STATE_SHUTDOWN:
+ return dev->set_state_shutdown(dev);
+
+ case CLOCK_EVT_STATE_PERIODIC:
+ /* Core internal bug */
+ if (!(dev->features & CLOCK_EVT_FEAT_PERIODIC))
+ return -ENOSYS;
+ return dev->set_state_periodic(dev);
+
+ case CLOCK_EVT_STATE_ONESHOT:
+ /* Core internal bug */
+ if (!(dev->features & CLOCK_EVT_FEAT_ONESHOT))
+ return -ENOSYS;
+ return dev->set_state_oneshot(dev);
+
+ default:
+ return -ENOSYS;
+ }
+}
+
/**
- * clockevents_set_mode - set the operating mode of a clock event device
+ * clockevents_set_state - set the operating state of a clock event device
* @dev: device to modify
- * @mode: new mode
+ * @state: new state
*
* Must be called with interrupts disabled !
*/
-void clockevents_set_mode(struct clock_event_device *dev,
- enum clock_event_mode mode)
+void clockevents_set_state(struct clock_event_device *dev,
+ enum clock_event_state state)
{
- if (dev->mode != mode) {
- dev->set_mode(mode, dev);
- dev->mode = mode;
+ if (dev->state != state) {
+ if (__clockevents_set_state(dev, state))
+ return;
+
+ dev->state = state;

/*
* A nsec2cyc multiplicator of 0 is invalid and we'd crash
* on it, so fix it up and emit a warning:
*/
- if (mode == CLOCK_EVT_MODE_ONESHOT) {
+ if (state == CLOCK_EVT_STATE_ONESHOT) {
if (unlikely(!dev->mult)) {
dev->mult = 1;
WARN_ON(1);
@@ -127,10 +178,28 @@ void clockevents_set_mode(struct clock_event_device *dev,
*/
void clockevents_shutdown(struct clock_event_device *dev)
{
- clockevents_set_mode(dev, CLOCK_EVT_MODE_SHUTDOWN);
+ clockevents_set_state(dev, CLOCK_EVT_STATE_SHUTDOWN);
dev->next_event.tv64 = KTIME_MAX;
}

+/**
+ * clockevents_tick_resume - Resume the tick device before using it again
+ * @dev: device to resume
+ */
+int clockevents_tick_resume(struct clock_event_device *dev)
+{
+ int ret = 0;
+
+ if (dev->set_mode) {
+ dev->set_mode(CLOCK_EVT_MODE_RESUME, dev);
+ dev->mode = CLOCK_EVT_MODE_RESUME;
+ } else if (dev->tick_resume) {
+ ret = dev->tick_resume(dev);
+ }
+
+ return ret;
+}
+
#ifdef CONFIG_GENERIC_CLOCKEVENTS_MIN_ADJUST

/* Limit min_delta to a jiffie */
@@ -183,7 +252,7 @@ static int clockevents_program_min_delta(struct clock_event_device *dev)
delta = dev->min_delta_ns;
dev->next_event = ktime_add_ns(ktime_get(), delta);

- if (dev->mode == CLOCK_EVT_MODE_SHUTDOWN)
+ if (dev->state == CLOCK_EVT_STATE_SHUTDOWN)
return 0;

dev->retries++;
@@ -220,7 +289,7 @@ static int clockevents_program_min_delta(struct clock_event_device *dev)
delta = dev->min_delta_ns;
dev->next_event = ktime_add_ns(ktime_get(), delta);

- if (dev->mode == CLOCK_EVT_MODE_SHUTDOWN)
+ if (dev->state == CLOCK_EVT_STATE_SHUTDOWN)
return 0;

dev->retries++;
@@ -252,7 +321,7 @@ int clockevents_program_event(struct clock_event_device *dev, ktime_t expires,

dev->next_event = expires;

- if (dev->mode == CLOCK_EVT_MODE_SHUTDOWN)
+ if (dev->state == CLOCK_EVT_STATE_SHUTDOWN)
return 0;

/* Shortcut for clockevent devices that can deal with ktime. */
@@ -297,7 +366,7 @@ static int clockevents_replace(struct clock_event_device *ced)
struct clock_event_device *dev, *newdev = NULL;

list_for_each_entry(dev, &clockevent_devices, list) {
- if (dev == ced || dev->mode != CLOCK_EVT_MODE_UNUSED)
+ if (dev == ced || dev->state != CLOCK_EVT_STATE_DETACHED)
continue;

if (!tick_check_replacement(newdev, dev))
@@ -323,7 +392,7 @@ static int clockevents_replace(struct clock_event_device *ced)
static int __clockevents_try_unbind(struct clock_event_device *ced, int cpu)
{
/* Fast track. Device is unused */
- if (ced->mode == CLOCK_EVT_MODE_UNUSED) {
+ if (ced->state == CLOCK_EVT_STATE_DETACHED) {
list_del_init(&ced->list);
return 0;
}
@@ -373,6 +442,37 @@ int clockevents_unbind_device(struct clock_event_device *ced, int cpu)
}
EXPORT_SYMBOL_GPL(clockevents_unbind);

+/* Sanity check of state transition callbacks */
+static int clockevents_sanity_check(struct clock_event_device *dev)
+{
+ /* Legacy set_mode() callback */
+ if (dev->set_mode) {
+ /* We shouldn't be supporting new modes now */
+ WARN_ON(dev->set_state_periodic || dev->set_state_oneshot ||
+ dev->set_state_shutdown || dev->tick_resume);
+
+ BUG_ON(dev->mode != CLOCK_EVT_MODE_UNUSED);
+ return 0;
+ }
+
+ if (dev->features & CLOCK_EVT_FEAT_DUMMY)
+ return 0;
+
+ /* New state-specific callbacks */
+ if (!dev->set_state_shutdown)
+ return -EINVAL;
+
+ if ((dev->features & CLOCK_EVT_FEAT_PERIODIC) &&
+ !dev->set_state_periodic)
+ return -EINVAL;
+
+ if ((dev->features & CLOCK_EVT_FEAT_ONESHOT) &&
+ !dev->set_state_oneshot)
+ return -EINVAL;
+
+ return 0;
+}
+
/**
* clockevents_register_device - register a clock event device
* @dev: device to register
@@ -381,7 +481,11 @@ void clockevents_register_device(struct clock_event_device *dev)
{
unsigned long flags;

- BUG_ON(dev->mode != CLOCK_EVT_MODE_UNUSED);
+ BUG_ON(clockevents_sanity_check(dev));
+
+ /* Initialize state to DETACHED */
+ dev->state = CLOCK_EVT_STATE_DETACHED;
+
if (!dev->cpumask) {
WARN_ON(num_possible_cpus() > 1);
dev->cpumask = cpumask_of(smp_processor_id());
@@ -445,11 +549,11 @@ int __clockevents_update_freq(struct clock_event_device *dev, u32 freq)
{
clockevents_config(dev, freq);

- if (dev->mode == CLOCK_EVT_MODE_ONESHOT)
+ if (dev->state == CLOCK_EVT_STATE_ONESHOT)
return clockevents_program_event(dev, dev->next_event, false);

- if (dev->mode == CLOCK_EVT_MODE_PERIODIC)
- dev->set_mode(CLOCK_EVT_MODE_PERIODIC, dev);
+ if (dev->state == CLOCK_EVT_STATE_PERIODIC)
+ return __clockevents_set_state(dev, CLOCK_EVT_STATE_PERIODIC);

return 0;
}
@@ -491,30 +595,27 @@ void clockevents_handle_noop(struct clock_event_device *dev)
* @old: device to release (can be NULL)
* @new: device to request (can be NULL)
*
- * Called from the notifier chain. clockevents_lock is held already
+ * Called from various tick functions with clockevents_lock held and
+ * interrupts disabled.
*/
void clockevents_exchange_device(struct clock_event_device *old,
struct clock_event_device *new)
{
- unsigned long flags;
-
- local_irq_save(flags);
/*
* Caller releases a clock event device. We queue it into the
* released list and do a notify add later.
*/
if (old) {
module_put(old->owner);
- clockevents_set_mode(old, CLOCK_EVT_MODE_UNUSED);
+ clockevents_set_state(old, CLOCK_EVT_STATE_DETACHED);
list_del(&old->list);
list_add(&old->list, &clockevents_released);
}

if (new) {
- BUG_ON(new->mode != CLOCK_EVT_MODE_UNUSED);
+ BUG_ON(new->state != CLOCK_EVT_STATE_DETACHED);
clockevents_shutdown(new);
}
- local_irq_restore(flags);
}

/**
@@ -541,74 +642,40 @@ void clockevents_resume(void)
dev->resume(dev);
}

-#ifdef CONFIG_GENERIC_CLOCKEVENTS
+#ifdef CONFIG_HOTPLUG_CPU
/**
- * clockevents_notify - notification about relevant events
- * Returns 0 on success, any other value on error
+ * tick_cleanup_dead_cpu - Cleanup the tick and clockevents of a dead cpu
*/
-int clockevents_notify(unsigned long reason, void *arg)
+void tick_cleanup_dead_cpu(int cpu)
{
struct clock_event_device *dev, *tmp;
unsigned long flags;
- int cpu, ret = 0;

raw_spin_lock_irqsave(&clockevents_lock, flags);

- switch (reason) {
- case CLOCK_EVT_NOTIFY_BROADCAST_ON:
- case CLOCK_EVT_NOTIFY_BROADCAST_OFF:
- case CLOCK_EVT_NOTIFY_BROADCAST_FORCE:
- tick_broadcast_on_off(reason, arg);
- break;
-
- case CLOCK_EVT_NOTIFY_BROADCAST_ENTER:
- case CLOCK_EVT_NOTIFY_BROADCAST_EXIT:
- ret = tick_broadcast_oneshot_control(reason);
- break;
-
- case CLOCK_EVT_NOTIFY_CPU_DYING:
- tick_handover_do_timer(arg);
- break;
-
- case CLOCK_EVT_NOTIFY_SUSPEND:
- tick_suspend();
- tick_suspend_broadcast();
- break;
-
- case CLOCK_EVT_NOTIFY_RESUME:
- tick_resume();
- break;
-
- case CLOCK_EVT_NOTIFY_CPU_DEAD:
- tick_shutdown_broadcast_oneshot(arg);
- tick_shutdown_broadcast(arg);
- tick_shutdown(arg);
- /*
- * Unregister the clock event devices which were
- * released from the users in the notify chain.
- */
- list_for_each_entry_safe(dev, tmp, &clockevents_released, list)
+ tick_shutdown_broadcast_oneshot(cpu);
+ tick_shutdown_broadcast(cpu);
+ tick_shutdown(cpu);
+ /*
+ * Unregister the clock event devices which were
+ * released from the users in the notify chain.
+ */
+ list_for_each_entry_safe(dev, tmp, &clockevents_released, list)
+ list_del(&dev->list);
+ /*
+ * Now check whether the CPU has left unused per cpu devices
+ */
+ list_for_each_entry_safe(dev, tmp, &clockevent_devices, list) {
+ if (cpumask_test_cpu(cpu, dev->cpumask) &&
+ cpumask_weight(dev->cpumask) == 1 &&
+ !tick_is_broadcast_device(dev)) {
+ BUG_ON(dev->state != CLOCK_EVT_STATE_DETACHED);
list_del(&dev->list);
- /*
- * Now check whether the CPU has left unused per cpu devices
- */
- cpu = *((int *)arg);
- list_for_each_entry_safe(dev, tmp, &clockevent_devices, list) {
- if (cpumask_test_cpu(cpu, dev->cpumask) &&
- cpumask_weight(dev->cpumask) == 1 &&
- !tick_is_broadcast_device(dev)) {
- BUG_ON(dev->mode != CLOCK_EVT_MODE_UNUSED);
- list_del(&dev->list);
- }
}
- break;
- default:
- break;
}
raw_spin_unlock_irqrestore(&clockevents_lock, flags);
- return ret;
}
-EXPORT_SYMBOL_GPL(clockevents_notify);
+#endif

#ifdef CONFIG_SYSFS
struct bus_type clockevents_subsys = {
@@ -727,5 +794,3 @@ static int __init clockevents_init_sysfs(void)
}
device_initcall(clockevents_init_sysfs);
#endif /* SYSFS */
-
-#endif /* GENERIC_CLOCK_EVENTS */
diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 4892352f0e49..15facb1b9c60 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -142,13 +142,6 @@ static void __clocksource_unstable(struct clocksource *cs)
schedule_work(&watchdog_work);
}

-static void clocksource_unstable(struct clocksource *cs, int64_t delta)
-{
- printk(KERN_WARNING "Clocksource %s unstable (delta = %Ld ns)\n",
- cs->name, delta);
- __clocksource_unstable(cs);
-}
-
/**
* clocksource_mark_unstable - mark clocksource unstable via watchdog
* @cs: clocksource to be marked unstable
@@ -174,7 +167,7 @@ void clocksource_mark_unstable(struct clocksource *cs)
static void clocksource_watchdog(unsigned long data)
{
struct clocksource *cs;
- cycle_t csnow, wdnow, delta;
+ cycle_t csnow, wdnow, cslast, wdlast, delta;
int64_t wd_nsec, cs_nsec;
int next_cpu, reset_pending;

@@ -213,6 +206,8 @@ static void clocksource_watchdog(unsigned long data)

delta = clocksource_delta(csnow, cs->cs_last, cs->mask);
cs_nsec = clocksource_cyc2ns(delta, cs->mult, cs->shift);
+ wdlast = cs->wd_last; /* save these in case we print them */
+ cslast = cs->cs_last;
cs->cs_last = csnow;
cs->wd_last = wdnow;

@@ -221,7 +216,12 @@ static void clocksource_watchdog(unsigned long data)

/* Check the deviation from the watchdog clocksource. */
if ((abs(cs_nsec - wd_nsec) > WATCHDOG_THRESHOLD)) {
- clocksource_unstable(cs, cs_nsec - wd_nsec);
+ pr_warn("timekeeping watchdog: Marking clocksource '%s' as unstable, because the skew is too large:\n", cs->name);
+ pr_warn(" '%s' wd_now: %llx wd_last: %llx mask: %llx\n",
+ watchdog->name, wdnow, wdlast, watchdog->mask);
+ pr_warn(" '%s' cs_now: %llx cs_last: %llx mask: %llx\n",
+ cs->name, csnow, cslast, cs->mask);
+ __clocksource_unstable(cs);
continue;
}

@@ -469,26 +469,25 @@ static u32 clocksource_max_adjustment(struct clocksource *cs)
* @shift: cycle to nanosecond divisor (power of two)
* @maxadj: maximum adjustment value to mult (~11%)
* @mask: bitmask for two's complement subtraction of non 64 bit counters
+ * @max_cyc: maximum cycle value before potential overflow (does not include
+ * any safety margin)
+ *
+ * NOTE: This function includes a safety margin of 50%, in other words, we
+ * return half the number of nanoseconds the hardware counter can technically
+ * cover. This is done so that we can potentially detect problems caused by
+ * delayed timers or bad hardware, which might result in time intervals that
+ * are larger then what the math used can handle without overflows.
*/
-u64 clocks_calc_max_nsecs(u32 mult, u32 shift, u32 maxadj, u64 mask)
+u64 clocks_calc_max_nsecs(u32 mult, u32 shift, u32 maxadj, u64 mask, u64 *max_cyc)
{
u64 max_nsecs, max_cycles;

/*
* Calculate the maximum number of cycles that we can pass to the
- * cyc2ns function without overflowing a 64-bit signed result. The
- * maximum number of cycles is equal to ULLONG_MAX/(mult+maxadj)
- * which is equivalent to the below.
- * max_cycles < (2^63)/(mult + maxadj)
- * max_cycles < 2^(log2((2^63)/(mult + maxadj)))
- * max_cycles < 2^(log2(2^63) - log2(mult + maxadj))
- * max_cycles < 2^(63 - log2(mult + maxadj))
- * max_cycles < 1 << (63 - log2(mult + maxadj))
- * Please note that we add 1 to the result of the log2 to account for
- * any rounding errors, ensure the above inequality is satisfied and
- * no overflow will occur.
+ * cyc2ns() function without overflowing a 64-bit result.
*/
- max_cycles = 1ULL << (63 - (ilog2(mult + maxadj) + 1));
+ max_cycles = ULLONG_MAX;
+ do_div(max_cycles, mult+maxadj);

/*
* The actual maximum number of cycles we can defer the clocksource is
@@ -499,27 +498,26 @@ u64 clocks_calc_max_nsecs(u32 mult, u32 shift, u32 maxadj, u64 mask)
max_cycles = min(max_cycles, mask);
max_nsecs = clocksource_cyc2ns(max_cycles, mult - maxadj, shift);

+ /* return the max_cycles value as well if requested */
+ if (max_cyc)
+ *max_cyc = max_cycles;
+
+ /* Return 50% of the actual maximum, so we can detect bad values */
+ max_nsecs >>= 1;
+
return max_nsecs;
}

/**
- * clocksource_max_deferment - Returns max time the clocksource can be deferred
- * @cs: Pointer to clocksource
+ * clocksource_update_max_deferment - Updates the clocksource max_idle_ns & max_cycles
+ * @cs: Pointer to clocksource to be updated
*
*/
-static u64 clocksource_max_deferment(struct clocksource *cs)
+static inline void clocksource_update_max_deferment(struct clocksource *cs)
{
- u64 max_nsecs;
-
- max_nsecs = clocks_calc_max_nsecs(cs->mult, cs->shift, cs->maxadj,
- cs->mask);
- /*
- * To ensure that the clocksource does not wrap whilst we are idle,
- * limit the time the clocksource can be deferred by 12.5%. Please
- * note a margin of 12.5% is used because this can be computed with
- * a shift, versus say 10% which would require division.
- */
- return max_nsecs - (max_nsecs >> 3);
+ cs->max_idle_ns = clocks_calc_max_nsecs(cs->mult, cs->shift,
+ cs->maxadj, cs->mask,
+ &cs->max_cycles);
}

#ifndef CONFIG_ARCH_USES_GETTIMEOFFSET
@@ -648,7 +646,7 @@ static void clocksource_enqueue(struct clocksource *cs)
}

/**
- * __clocksource_updatefreq_scale - Used update clocksource with new freq
+ * __clocksource_update_freq_scale - Used update clocksource with new freq
* @cs: clocksource to be registered
* @scale: Scale factor multiplied against freq to get clocksource hz
* @freq: clocksource frequency (cycles per second) divided by scale
@@ -656,48 +654,64 @@ static void clocksource_enqueue(struct clocksource *cs)
* This should only be called from the clocksource->enable() method.
*
* This *SHOULD NOT* be called directly! Please use the
- * clocksource_updatefreq_hz() or clocksource_updatefreq_khz helper functions.
+ * __clocksource_update_freq_hz() or __clocksource_update_freq_khz() helper
+ * functions.
*/
-void __clocksource_updatefreq_scale(struct clocksource *cs, u32 scale, u32 freq)
+void __clocksource_update_freq_scale(struct clocksource *cs, u32 scale, u32 freq)
{
u64 sec;
+
/*
- * Calc the maximum number of seconds which we can run before
- * wrapping around. For clocksources which have a mask > 32bit
- * we need to limit the max sleep time to have a good
- * conversion precision. 10 minutes is still a reasonable
- * amount. That results in a shift value of 24 for a
- * clocksource with mask >= 40bit and f >= 4GHz. That maps to
- * ~ 0.06ppm granularity for NTP. We apply the same 12.5%
- * margin as we do in clocksource_max_deferment()
+ * Default clocksources are *special* and self-define their mult/shift.
+ * But, you're not special, so you should specify a freq value.
*/
- sec = (cs->mask - (cs->mask >> 3));
- do_div(sec, freq);
- do_div(sec, scale);
- if (!sec)
- sec = 1;
- else if (sec > 600 && cs->mask > UINT_MAX)
- sec = 600;
-
- clocks_calc_mult_shift(&cs->mult, &cs->shift, freq,
- NSEC_PER_SEC / scale, sec * scale);
-
+ if (freq) {
+ /*
+ * Calc the maximum number of seconds which we can run before
+ * wrapping around. For clocksources which have a mask > 32-bit
+ * we need to limit the max sleep time to have a good
+ * conversion precision. 10 minutes is still a reasonable
+ * amount. That results in a shift value of 24 for a
+ * clocksource with mask >= 40-bit and f >= 4GHz. That maps to
+ * ~ 0.06ppm granularity for NTP.
+ */
+ sec = cs->mask;
+ do_div(sec, freq);
+ do_div(sec, scale);
+ if (!sec)
+ sec = 1;
+ else if (sec > 600 && cs->mask > UINT_MAX)
+ sec = 600;
+
+ clocks_calc_mult_shift(&cs->mult, &cs->shift, freq,
+ NSEC_PER_SEC / scale, sec * scale);
+ }
/*
- * for clocksources that have large mults, to avoid overflow.
- * Since mult may be adjusted by ntp, add an safety extra margin
- *
+ * Ensure clocksources that have large 'mult' values don't overflow
+ * when adjusted.
*/
cs->maxadj = clocksource_max_adjustment(cs);
- while ((cs->mult + cs->maxadj < cs->mult)
- || (cs->mult - cs->maxadj > cs->mult)) {
+ while (freq && ((cs->mult + cs->maxadj < cs->mult)
+ || (cs->mult - cs->maxadj > cs->mult))) {
cs->mult >>= 1;
cs->shift--;
cs->maxadj = clocksource_max_adjustment(cs);
}

- cs->max_idle_ns = clocksource_max_deferment(cs);
+ /*
+ * Only warn for *special* clocksources that self-define
+ * their mult/shift values and don't specify a freq.
+ */
+ WARN_ONCE(cs->mult + cs->maxadj < cs->mult,
+ "timekeeping: Clocksource %s might overflow on 11%% adjustment\n",
+ cs->name);
+
+ clocksource_update_max_deferment(cs);
+
+ pr_info("clocksource %s: mask: 0x%llx max_cycles: 0x%llx, max_idle_ns: %lld ns\n",
+ cs->name, cs->mask, cs->max_cycles, cs->max_idle_ns);
}
-EXPORT_SYMBOL_GPL(__clocksource_updatefreq_scale);
+EXPORT_SYMBOL_GPL(__clocksource_update_freq_scale);

/**
* __clocksource_register_scale - Used to install new clocksources
@@ -714,7 +728,7 @@ int __clocksource_register_scale(struct clocksource *cs, u32 scale, u32 freq)
{

/* Initialize mult/shift and max_idle_ns */
- __clocksource_updatefreq_scale(cs, scale, freq);
+ __clocksource_update_freq_scale(cs, scale, freq);

/* Add clocksource to the clocksource list */
mutex_lock(&clocksource_mutex);
@@ -726,33 +740,6 @@ int __clocksource_register_scale(struct clocksource *cs, u32 scale, u32 freq)
}
EXPORT_SYMBOL_GPL(__clocksource_register_scale);

-
-/**
- * clocksource_register - Used to install new clocksources
- * @cs: clocksource to be registered
- *
- * Returns -EBUSY if registration fails, zero otherwise.
- */
-int clocksource_register(struct clocksource *cs)
-{
- /* calculate max adjustment for given mult/shift */
- cs->maxadj = clocksource_max_adjustment(cs);
- WARN_ONCE(cs->mult + cs->maxadj < cs->mult,
- "Clocksource %s might overflow on 11%% adjustment\n",
- cs->name);
-
- /* calculate max idle time permitted for this clocksource */
- cs->max_idle_ns = clocksource_max_deferment(cs);
-
- mutex_lock(&clocksource_mutex);
- clocksource_enqueue(cs);
- clocksource_enqueue_watchdog(cs);
- clocksource_select();
- mutex_unlock(&clocksource_mutex);
- return 0;
-}
-EXPORT_SYMBOL(clocksource_register);
-
static void __clocksource_change_rating(struct clocksource *cs, int rating)
{
list_del(&cs->list);
diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index bee0c1f78091..76d4bd962b19 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -54,7 +54,7 @@

#include <trace/events/timer.h>

-#include "timekeeping.h"
+#include "tick-internal.h"

/*
* The timer bases:
@@ -1707,17 +1707,10 @@ static int hrtimer_cpu_notify(struct notifier_block *self,
break;

#ifdef CONFIG_HOTPLUG_CPU
- case CPU_DYING:
- case CPU_DYING_FROZEN:
- clockevents_notify(CLOCK_EVT_NOTIFY_CPU_DYING, &scpu);
- break;
case CPU_DEAD:
case CPU_DEAD_FROZEN:
- {
- clockevents_notify(CLOCK_EVT_NOTIFY_CPU_DEAD, &scpu);
migrate_hrtimers(scpu);
break;
- }
#endif

default:
diff --git a/kernel/time/jiffies.c b/kernel/time/jiffies.c
index a6a5bf53e86d..347fecf86a3f 100644
--- a/kernel/time/jiffies.c
+++ b/kernel/time/jiffies.c
@@ -25,7 +25,7 @@
#include <linux/module.h>
#include <linux/init.h>

-#include "tick-internal.h"
+#include "timekeeping.h"

/* The Jiffies based clocksource is the lowest common
* denominator clock source which should function on
@@ -71,6 +71,7 @@ static struct clocksource clocksource_jiffies = {
.mask = 0xffffffff, /*32bits*/
.mult = NSEC_PER_JIFFY << JIFFIES_SHIFT, /* details above */
.shift = JIFFIES_SHIFT,
+ .max_cycles = 10,
};

__cacheline_aligned_in_smp DEFINE_SEQLOCK(jiffies_lock);
@@ -94,7 +95,7 @@ EXPORT_SYMBOL(jiffies);

static int __init init_jiffies_clocksource(void)
{
- return clocksource_register(&clocksource_jiffies);
+ return __clocksource_register(&clocksource_jiffies);
}

core_initcall(init_jiffies_clocksource);
@@ -130,6 +131,6 @@ int register_refined_jiffies(long cycles_per_second)

refined_jiffies.mult = ((u32)nsec_per_tick) << JIFFIES_SHIFT;

- clocksource_register(&refined_jiffies);
+ __clocksource_register(&refined_jiffies);
return 0;
}
diff --git a/kernel/time/ntp.c b/kernel/time/ntp.c
index 0f60b08a4f07..7a681003001c 100644
--- a/kernel/time/ntp.c
+++ b/kernel/time/ntp.c
@@ -17,7 +17,6 @@
#include <linux/module.h>
#include <linux/rtc.h>

-#include "tick-internal.h"
#include "ntp_internal.h"

/*
@@ -459,6 +458,16 @@ int second_overflow(unsigned long secs)
return leap;
}

+#ifdef CONFIG_GENERIC_CMOS_UPDATE
+int __weak update_persistent_clock64(struct timespec64 now64)
+{
+ struct timespec now;
+
+ now = timespec64_to_timespec(now64);
+ return update_persistent_clock(now);
+}
+#endif
+
#if defined(CONFIG_GENERIC_CMOS_UPDATE) || defined(CONFIG_RTC_SYSTOHC)
static void sync_cmos_clock(struct work_struct *work);

@@ -494,8 +503,9 @@ static void sync_cmos_clock(struct work_struct *work)
if (persistent_clock_is_local)
adjust.tv_sec -= (sys_tz.tz_minuteswest * 60);
#ifdef CONFIG_GENERIC_CMOS_UPDATE
- fail = update_persistent_clock(timespec64_to_timespec(adjust));
+ fail = update_persistent_clock64(adjust);
#endif
+
#ifdef CONFIG_RTC_SYSTOHC
if (fail == -ENODEV)
fail = rtc_set_ntp_time(adjust);
diff --git a/kernel/time/sched_clock.c b/kernel/time/sched_clock.c
index 01d2d15aa662..a26036d37a38 100644
--- a/kernel/time/sched_clock.c
+++ b/kernel/time/sched_clock.c
@@ -1,5 +1,6 @@
/*
- * sched_clock.c: support for extending counters to full 64-bit ns counter
+ * sched_clock.c: Generic sched_clock() support, to extend low level
+ * hardware time counters to full 64-bit ns values.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
@@ -18,15 +19,53 @@
#include <linux/seqlock.h>
#include <linux/bitops.h>

-struct clock_data {
- ktime_t wrap_kt;
+/**
+ * struct clock_read_data - data required to read from sched_clock()
+ *
+ * @epoch_ns: sched_clock() value at last update
+ * @epoch_cyc: Clock cycle value at last update.
+ * @sched_clock_mask: Bitmask for two's complement subtraction of non 64bit
+ * clocks.
+ * @read_sched_clock: Current clock source (or dummy source when suspended).
+ * @mult: Multipler for scaled math conversion.
+ * @shift: Shift value for scaled math conversion.
+ *
+ * Care must be taken when updating this structure; it is read by
+ * some very hot code paths. It occupies <=40 bytes and, when combined
+ * with the seqcount used to synchronize access, comfortably fits into
+ * a 64 byte cache line.
+ */
+struct clock_read_data {
u64 epoch_ns;
u64 epoch_cyc;
- seqcount_t seq;
- unsigned long rate;
+ u64 sched_clock_mask;
+ u64 (*read_sched_clock)(void);
u32 mult;
u32 shift;
- bool suspended;
+};
+
+/**
+ * struct clock_data - all data needed for sched_clock() (including
+ * registration of a new clock source)
+ *
+ * @seq: Sequence counter for protecting updates. The lowest
+ * bit is the index for @read_data.
+ * @read_data: Data required to read from sched_clock.
+ * @wrap_kt: Duration for which clock can run before wrapping.
+ * @rate: Tick rate of the registered clock.
+ * @actual_read_sched_clock: Registered hardware level clock read function.
+ *
+ * The ordering of this structure has been chosen to optimize cache
+ * performance. In particular 'seq' and 'read_data[0]' (combined) should fit
+ * into a single 64-byte cache line.
+ */
+struct clock_data {
+ seqcount_t seq;
+ struct clock_read_data read_data[2];
+ ktime_t wrap_kt;
+ unsigned long rate;
+
+ u64 (*actual_read_sched_clock)(void);
};

static struct hrtimer sched_clock_timer;
@@ -34,12 +73,6 @@ static int irqtime = -1;

core_param(irqtime, irqtime, int, 0400);

-static struct clock_data cd = {
- .mult = NSEC_PER_SEC / HZ,
-};
-
-static u64 __read_mostly sched_clock_mask;
-
static u64 notrace jiffy_sched_clock_read(void)
{
/*
@@ -49,7 +82,11 @@ static u64 notrace jiffy_sched_clock_read(void)
return (u64)(jiffies - INITIAL_JIFFIES);
}

-static u64 __read_mostly (*read_sched_clock)(void) = jiffy_sched_clock_read;
+static struct clock_data cd ____cacheline_aligned = {
+ .read_data[0] = { .mult = NSEC_PER_SEC / HZ,
+ .read_sched_clock = jiffy_sched_clock_read, },
+ .actual_read_sched_clock = jiffy_sched_clock_read,
+};

static inline u64 notrace cyc_to_ns(u64 cyc, u32 mult, u32 shift)
{
@@ -58,111 +95,136 @@ static inline u64 notrace cyc_to_ns(u64 cyc, u32 mult, u32 shift)

unsigned long long notrace sched_clock(void)
{
- u64 epoch_ns;
- u64 epoch_cyc;
- u64 cyc;
+ u64 cyc, res;
unsigned long seq;
-
- if (cd.suspended)
- return cd.epoch_ns;
+ struct clock_read_data *rd;

do {
- seq = raw_read_seqcount_begin(&cd.seq);
- epoch_cyc = cd.epoch_cyc;
- epoch_ns = cd.epoch_ns;
+ seq = raw_read_seqcount(&cd.seq);
+ rd = cd.read_data + (seq & 1);
+
+ cyc = (rd->read_sched_clock() - rd->epoch_cyc) &
+ rd->sched_clock_mask;
+ res = rd->epoch_ns + cyc_to_ns(cyc, rd->mult, rd->shift);
} while (read_seqcount_retry(&cd.seq, seq));

- cyc = read_sched_clock();
- cyc = (cyc - epoch_cyc) & sched_clock_mask;
- return epoch_ns + cyc_to_ns(cyc, cd.mult, cd.shift);
+ return res;
+}
+
+/*
+ * Updating the data required to read the clock.
+ *
+ * sched_clock() will never observe mis-matched data even if called from
+ * an NMI. We do this by maintaining an odd/even copy of the data and
+ * steering sched_clock() to one or the other using a sequence counter.
+ * In order to preserve the data cache profile of sched_clock() as much
+ * as possible the system reverts back to the even copy when the update
+ * completes; the odd copy is used *only* during an update.
+ */
+static void update_clock_read_data(struct clock_read_data *rd)
+{
+ /* update the backup (odd) copy with the new data */
+ cd.read_data[1] = *rd;
+
+ /* steer readers towards the odd copy */
+ raw_write_seqcount_latch(&cd.seq);
+
+ /* now its safe for us to update the normal (even) copy */
+ cd.read_data[0] = *rd;
+
+ /* switch readers back to the even copy */
+ raw_write_seqcount_latch(&cd.seq);
}

/*
- * Atomically update the sched_clock epoch.
+ * Atomically update the sched_clock() epoch.
*/
-static void notrace update_sched_clock(void)
+static void update_sched_clock(void)
{
- unsigned long flags;
u64 cyc;
u64 ns;
+ struct clock_read_data rd;
+
+ rd = cd.read_data[0];
+
+ cyc = cd.actual_read_sched_clock();
+ ns = rd.epoch_ns + cyc_to_ns((cyc - rd.epoch_cyc) & rd.sched_clock_mask, rd.mult, rd.shift);
+
+ rd.epoch_ns = ns;
+ rd.epoch_cyc = cyc;

- cyc = read_sched_clock();
- ns = cd.epoch_ns +
- cyc_to_ns((cyc - cd.epoch_cyc) & sched_clock_mask,
- cd.mult, cd.shift);
-
- raw_local_irq_save(flags);
- raw_write_seqcount_begin(&cd.seq);
- cd.epoch_ns = ns;
- cd.epoch_cyc = cyc;
- raw_write_seqcount_end(&cd.seq);
- raw_local_irq_restore(flags);
+ update_clock_read_data(&rd);
}

static enum hrtimer_restart sched_clock_poll(struct hrtimer *hrt)
{
update_sched_clock();
hrtimer_forward_now(hrt, cd.wrap_kt);
+
return HRTIMER_RESTART;
}

-void __init sched_clock_register(u64 (*read)(void), int bits,
- unsigned long rate)
+void __init
+sched_clock_register(u64 (*read)(void), int bits, unsigned long rate)
{
u64 res, wrap, new_mask, new_epoch, cyc, ns;
u32 new_mult, new_shift;
- ktime_t new_wrap_kt;
unsigned long r;
char r_unit;
+ struct clock_read_data rd;

if (cd.rate > rate)
return;

WARN_ON(!irqs_disabled());

- /* calculate the mult/shift to convert counter ticks to ns. */
+ /* Calculate the mult/shift to convert counter ticks to ns. */
clocks_calc_mult_shift(&new_mult, &new_shift, rate, NSEC_PER_SEC, 3600);

new_mask = CLOCKSOURCE_MASK(bits);
+ cd.rate = rate;
+
+ /* Calculate how many nanosecs until we risk wrapping */
+ wrap = clocks_calc_max_nsecs(new_mult, new_shift, 0, new_mask, NULL);
+ cd.wrap_kt = ns_to_ktime(wrap);

- /* calculate how many ns until we wrap */
- wrap = clocks_calc_max_nsecs(new_mult, new_shift, 0, new_mask);
- new_wrap_kt = ns_to_ktime(wrap - (wrap >> 3));
+ rd = cd.read_data[0];

- /* update epoch for new counter and update epoch_ns from old counter*/
+ /* Update epoch for new counter and update 'epoch_ns' from old counter*/
new_epoch = read();
- cyc = read_sched_clock();
- ns = cd.epoch_ns + cyc_to_ns((cyc - cd.epoch_cyc) & sched_clock_mask,
- cd.mult, cd.shift);
+ cyc = cd.actual_read_sched_clock();
+ ns = rd.epoch_ns + cyc_to_ns((cyc - rd.epoch_cyc) & rd.sched_clock_mask, rd.mult, rd.shift);
+ cd.actual_read_sched_clock = read;

- raw_write_seqcount_begin(&cd.seq);
- read_sched_clock = read;
- sched_clock_mask = new_mask;
- cd.rate = rate;
- cd.wrap_kt = new_wrap_kt;
- cd.mult = new_mult;
- cd.shift = new_shift;
- cd.epoch_cyc = new_epoch;
- cd.epoch_ns = ns;
- raw_write_seqcount_end(&cd.seq);
+ rd.read_sched_clock = read;
+ rd.sched_clock_mask = new_mask;
+ rd.mult = new_mult;
+ rd.shift = new_shift;
+ rd.epoch_cyc = new_epoch;
+ rd.epoch_ns = ns;
+
+ update_clock_read_data(&rd);

r = rate;
if (r >= 4000000) {
r /= 1000000;
r_unit = 'M';
- } else if (r >= 1000) {
- r /= 1000;
- r_unit = 'k';
- } else
- r_unit = ' ';
-
- /* calculate the ns resolution of this counter */
+ } else {
+ if (r >= 1000) {
+ r /= 1000;
+ r_unit = 'k';
+ } else {
+ r_unit = ' ';
+ }
+ }
+
+ /* Calculate the ns resolution of this counter */
res = cyc_to_ns(1ULL, new_mult, new_shift);

pr_info("sched_clock: %u bits at %lu%cHz, resolution %lluns, wraps every %lluns\n",
bits, r, r_unit, res, wrap);

- /* Enable IRQ time accounting if we have a fast enough sched_clock */
+ /* Enable IRQ time accounting if we have a fast enough sched_clock() */
if (irqtime > 0 || (irqtime == -1 && rate >= 1000000))
enable_sched_clock_irqtime();

@@ -172,10 +234,10 @@ void __init sched_clock_register(u64 (*read)(void), int bits,
void __init sched_clock_postinit(void)
{
/*
- * If no sched_clock function has been provided at that point,
+ * If no sched_clock() function has been provided at that point,
* make it the final one one.
*/
- if (read_sched_clock == jiffy_sched_clock_read)
+ if (cd.actual_read_sched_clock == jiffy_sched_clock_read)
sched_clock_register(jiffy_sched_clock_read, BITS_PER_LONG, HZ);

update_sched_clock();
@@ -189,29 +251,53 @@ void __init sched_clock_postinit(void)
hrtimer_start(&sched_clock_timer, cd.wrap_kt, HRTIMER_MODE_REL);
}

+/*
+ * Clock read function for use when the clock is suspended.
+ *
+ * This function makes it appear to sched_clock() as if the clock
+ * stopped counting at its last update.
+ *
+ * This function must only be called from the critical
+ * section in sched_clock(). It relies on the read_seqcount_retry()
+ * at the end of the critical section to be sure we observe the
+ * correct copy of 'epoch_cyc'.
+ */
+static u64 notrace suspended_sched_clock_read(void)
+{
+ unsigned long seq = raw_read_seqcount(&cd.seq);
+
+ return cd.read_data[seq & 1].epoch_cyc;
+}
+
static int sched_clock_suspend(void)
{
+ struct clock_read_data *rd = &cd.read_data[0];
+
update_sched_clock();
hrtimer_cancel(&sched_clock_timer);
- cd.suspended = true;
+ rd->read_sched_clock = suspended_sched_clock_read;
+
return 0;
}

static void sched_clock_resume(void)
{
- cd.epoch_cyc = read_sched_clock();
+ struct clock_read_data *rd = &cd.read_data[0];
+
+ rd->epoch_cyc = cd.actual_read_sched_clock();
hrtimer_start(&sched_clock_timer, cd.wrap_kt, HRTIMER_MODE_REL);
- cd.suspended = false;
+ rd->read_sched_clock = cd.actual_read_sched_clock;
}

static struct syscore_ops sched_clock_ops = {
- .suspend = sched_clock_suspend,
- .resume = sched_clock_resume,
+ .suspend = sched_clock_suspend,
+ .resume = sched_clock_resume,
};

static int __init sched_clock_syscore_init(void)
{
register_syscore_ops(&sched_clock_ops);
+
return 0;
}
device_initcall(sched_clock_syscore_init);
diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
index 066f0ec05e48..7e8ca4f448a8 100644
--- a/kernel/time/tick-broadcast.c
+++ b/kernel/time/tick-broadcast.c
@@ -33,12 +33,14 @@ static cpumask_var_t tick_broadcast_mask;
static cpumask_var_t tick_broadcast_on;
static cpumask_var_t tmpmask;
static DEFINE_RAW_SPINLOCK(tick_broadcast_lock);
-static int tick_broadcast_force;
+static int tick_broadcast_forced;

#ifdef CONFIG_TICK_ONESHOT
static void tick_broadcast_clear_oneshot(int cpu);
+static void tick_resume_broadcast_oneshot(struct clock_event_device *bc);
#else
static inline void tick_broadcast_clear_oneshot(int cpu) { }
+static inline void tick_resume_broadcast_oneshot(struct clock_event_device *bc) { }
#endif

/*
@@ -303,7 +305,7 @@ static void tick_handle_periodic_broadcast(struct clock_event_device *dev)
/*
* The device is in periodic mode. No reprogramming necessary:
*/
- if (dev->mode == CLOCK_EVT_MODE_PERIODIC)
+ if (dev->state == CLOCK_EVT_STATE_PERIODIC)
goto unlock;

/*
@@ -324,49 +326,54 @@ static void tick_handle_periodic_broadcast(struct clock_event_device *dev)
raw_spin_unlock(&tick_broadcast_lock);
}

-/*
- * Powerstate information: The system enters/leaves a state, where
- * affected devices might stop
+/**
+ * tick_broadcast_control - Enable/disable or force broadcast mode
+ * @mode: The selected broadcast mode
+ *
+ * Called when the system enters a state where affected tick devices
+ * might stop. Note: TICK_BROADCAST_FORCE cannot be undone.
+ *
+ * Called with interrupts disabled, so clockevents_lock is not
+ * required here because the local clock event device cannot go away
+ * under us.
*/
-static void tick_do_broadcast_on_off(unsigned long *reason)
+void tick_broadcast_control(enum tick_broadcast_mode mode)
{
struct clock_event_device *bc, *dev;
struct tick_device *td;
- unsigned long flags;
int cpu, bc_stopped;

- raw_spin_lock_irqsave(&tick_broadcast_lock, flags);
-
- cpu = smp_processor_id();
- td = &per_cpu(tick_cpu_device, cpu);
+ td = this_cpu_ptr(&tick_cpu_device);
dev = td->evtdev;
- bc = tick_broadcast_device.evtdev;

/*
* Is the device not affected by the powerstate ?
*/
if (!dev || !(dev->features & CLOCK_EVT_FEAT_C3STOP))
- goto out;
+ return;

if (!tick_device_is_functional(dev))
- goto out;
+ return;

+ raw_spin_lock(&tick_broadcast_lock);
+ cpu = smp_processor_id();
+ bc = tick_broadcast_device.evtdev;
bc_stopped = cpumask_empty(tick_broadcast_mask);

- switch (*reason) {
- case CLOCK_EVT_NOTIFY_BROADCAST_ON:
- case CLOCK_EVT_NOTIFY_BROADCAST_FORCE:
+ switch (mode) {
+ case TICK_BROADCAST_FORCE:
+ tick_broadcast_forced = 1;
+ case TICK_BROADCAST_ON:
cpumask_set_cpu(cpu, tick_broadcast_on);
if (!cpumask_test_and_set_cpu(cpu, tick_broadcast_mask)) {
if (tick_broadcast_device.mode ==
TICKDEV_MODE_PERIODIC)
clockevents_shutdown(dev);
}
- if (*reason == CLOCK_EVT_NOTIFY_BROADCAST_FORCE)
- tick_broadcast_force = 1;
break;
- case CLOCK_EVT_NOTIFY_BROADCAST_OFF:
- if (tick_broadcast_force)
+
+ case TICK_BROADCAST_OFF:
+ if (tick_broadcast_forced)
break;
cpumask_clear_cpu(cpu, tick_broadcast_on);
if (!tick_device_is_functional(dev))
@@ -388,22 +395,9 @@ static void tick_do_broadcast_on_off(unsigned long *reason)
else
tick_broadcast_setup_oneshot(bc);
}
-out:
- raw_spin_unlock_irqrestore(&tick_broadcast_lock, flags);
-}
-
-/*
- * Powerstate information: The system enters/leaves a state, where
- * affected devices might stop.
- */
-void tick_broadcast_on_off(unsigned long reason, int *oncpu)
-{
- if (!cpumask_test_cpu(*oncpu, cpu_online_mask))
- printk(KERN_ERR "tick-broadcast: ignoring broadcast for "
- "offline CPU #%d\n", *oncpu);
- else
- tick_do_broadcast_on_off(&reason);
+ raw_spin_unlock(&tick_broadcast_lock);
}
+EXPORT_SYMBOL_GPL(tick_broadcast_control);

/*
* Set the periodic handler depending on broadcast on/off
@@ -416,14 +410,14 @@ void tick_set_periodic_handler(struct clock_event_device *dev, int broadcast)
dev->event_handler = tick_handle_periodic_broadcast;
}

+#ifdef CONFIG_HOTPLUG_CPU
/*
* Remove a CPU from broadcasting
*/
-void tick_shutdown_broadcast(unsigned int *cpup)
+void tick_shutdown_broadcast(unsigned int cpu)
{
struct clock_event_device *bc;
unsigned long flags;
- unsigned int cpu = *cpup;

raw_spin_lock_irqsave(&tick_broadcast_lock, flags);

@@ -438,6 +432,7 @@ void tick_shutdown_broadcast(unsigned int *cpup)

raw_spin_unlock_irqrestore(&tick_broadcast_lock, flags);
}
+#endif

void tick_suspend_broadcast(void)
{
@@ -453,38 +448,48 @@ void tick_suspend_broadcast(void)
raw_spin_unlock_irqrestore(&tick_broadcast_lock, flags);
}

-int tick_resume_broadcast(void)
+/*
+ * This is called from tick_resume_local() on a resuming CPU. That's
+ * called from the core resume function, tick_unfreeze() and the magic XEN
+ * resume hackery.
+ *
+ * In none of these cases the broadcast device mode can change and the
+ * bit of the resuming CPU in the broadcast mask is safe as well.
+ */
+bool tick_resume_check_broadcast(void)
+{
+ if (tick_broadcast_device.mode == TICKDEV_MODE_ONESHOT)
+ return false;
+ else
+ return cpumask_test_cpu(smp_processor_id(), tick_broadcast_mask);
+}
+
+void tick_resume_broadcast(void)
{
struct clock_event_device *bc;
unsigned long flags;
- int broadcast = 0;

raw_spin_lock_irqsave(&tick_broadcast_lock, flags);

bc = tick_broadcast_device.evtdev;

if (bc) {
- clockevents_set_mode(bc, CLOCK_EVT_MODE_RESUME);
+ clockevents_tick_resume(bc);

switch (tick_broadcast_device.mode) {
case TICKDEV_MODE_PERIODIC:
if (!cpumask_empty(tick_broadcast_mask))
tick_broadcast_start_periodic(bc);
- broadcast = cpumask_test_cpu(smp_processor_id(),
- tick_broadcast_mask);
break;
case TICKDEV_MODE_ONESHOT:
if (!cpumask_empty(tick_broadcast_mask))
- broadcast = tick_resume_broadcast_oneshot(bc);
+ tick_resume_broadcast_oneshot(bc);
break;
}
}
raw_spin_unlock_irqrestore(&tick_broadcast_lock, flags);
-
- return broadcast;
}

-
#ifdef CONFIG_TICK_ONESHOT

static cpumask_var_t tick_broadcast_oneshot_mask;
@@ -532,8 +537,8 @@ static int tick_broadcast_set_event(struct clock_event_device *bc, int cpu,
{
int ret;

- if (bc->mode != CLOCK_EVT_MODE_ONESHOT)
- clockevents_set_mode(bc, CLOCK_EVT_MODE_ONESHOT);
+ if (bc->state != CLOCK_EVT_STATE_ONESHOT)
+ clockevents_set_state(bc, CLOCK_EVT_STATE_ONESHOT);

ret = clockevents_program_event(bc, expires, force);
if (!ret)
@@ -541,10 +546,9 @@ static int tick_broadcast_set_event(struct clock_event_device *bc, int cpu,
return ret;
}

-int tick_resume_broadcast_oneshot(struct clock_event_device *bc)
+static void tick_resume_broadcast_oneshot(struct clock_event_device *bc)
{
- clockevents_set_mode(bc, CLOCK_EVT_MODE_ONESHOT);
- return 0;
+ clockevents_set_state(bc, CLOCK_EVT_STATE_ONESHOT);
}

/*
@@ -562,8 +566,8 @@ void tick_check_oneshot_broadcast_this_cpu(void)
* switched over, leave the device alone.
*/
if (td->mode == TICKDEV_MODE_ONESHOT) {
- clockevents_set_mode(td->evtdev,
- CLOCK_EVT_MODE_ONESHOT);
+ clockevents_set_state(td->evtdev,
+ CLOCK_EVT_STATE_ONESHOT);
}
}
}
@@ -666,31 +670,26 @@ static void broadcast_shutdown_local(struct clock_event_device *bc,
if (dev->next_event.tv64 < bc->next_event.tv64)
return;
}
- clockevents_set_mode(dev, CLOCK_EVT_MODE_SHUTDOWN);
+ clockevents_set_state(dev, CLOCK_EVT_STATE_SHUTDOWN);
}

-static void broadcast_move_bc(int deadcpu)
-{
- struct clock_event_device *bc = tick_broadcast_device.evtdev;
-
- if (!bc || !broadcast_needs_cpu(bc, deadcpu))
- return;
- /* This moves the broadcast assignment to this cpu */
- clockevents_program_event(bc, bc->next_event, 1);
-}
-
-/*
- * Powerstate information: The system enters/leaves a state, where
- * affected devices might stop
+/**
+ * tick_broadcast_oneshot_control - Enter/exit broadcast oneshot mode
+ * @state: The target state (enter/exit)
+ *
+ * The system enters/leaves a state, where affected devices might stop
* Returns 0 on success, -EBUSY if the cpu is used to broadcast wakeups.
+ *
+ * Called with interrupts disabled, so clockevents_lock is not
+ * required here because the local clock event device cannot go away
+ * under us.
*/
-int tick_broadcast_oneshot_control(unsigned long reason)
+int tick_broadcast_oneshot_control(enum tick_broadcast_state state)
{
struct clock_event_device *bc, *dev;
struct tick_device *td;
- unsigned long flags;
- ktime_t now;
int cpu, ret = 0;
+ ktime_t now;

/*
* Periodic mode does not care about the enter/exit of power
@@ -703,17 +702,17 @@ int tick_broadcast_oneshot_control(unsigned long reason)
* We are called with preemtion disabled from the depth of the
* idle code, so we can't be moved away.
*/
- cpu = smp_processor_id();
- td = &per_cpu(tick_cpu_device, cpu);
+ td = this_cpu_ptr(&tick_cpu_device);
dev = td->evtdev;

if (!(dev->features & CLOCK_EVT_FEAT_C3STOP))
return 0;

+ raw_spin_lock(&tick_broadcast_lock);
bc = tick_broadcast_device.evtdev;
+ cpu = smp_processor_id();

- raw_spin_lock_irqsave(&tick_broadcast_lock, flags);
- if (reason == CLOCK_EVT_NOTIFY_BROADCAST_ENTER) {
+ if (state == TICK_BROADCAST_ENTER) {
if (!cpumask_test_and_set_cpu(cpu, tick_broadcast_oneshot_mask)) {
WARN_ON_ONCE(cpumask_test_cpu(cpu, tick_broadcast_pending_mask));
broadcast_shutdown_local(bc, dev);
@@ -741,7 +740,7 @@ int tick_broadcast_oneshot_control(unsigned long reason)
cpumask_clear_cpu(cpu, tick_broadcast_oneshot_mask);
} else {
if (cpumask_test_and_clear_cpu(cpu, tick_broadcast_oneshot_mask)) {
- clockevents_set_mode(dev, CLOCK_EVT_MODE_ONESHOT);
+ clockevents_set_state(dev, CLOCK_EVT_STATE_ONESHOT);
/*
* The cpu which was handling the broadcast
* timer marked this cpu in the broadcast
@@ -805,9 +804,10 @@ int tick_broadcast_oneshot_control(unsigned long reason)
}
}
out:
- raw_spin_unlock_irqrestore(&tick_broadcast_lock, flags);
+ raw_spin_unlock(&tick_broadcast_lock);
return ret;
}
+EXPORT_SYMBOL_GPL(tick_broadcast_oneshot_control);

/*
* Reset the one shot broadcast for a cpu
@@ -842,7 +842,7 @@ void tick_broadcast_setup_oneshot(struct clock_event_device *bc)

/* Set it up only once ! */
if (bc->event_handler != tick_handle_oneshot_broadcast) {
- int was_periodic = bc->mode == CLOCK_EVT_MODE_PERIODIC;
+ int was_periodic = bc->state == CLOCK_EVT_STATE_PERIODIC;

bc->event_handler = tick_handle_oneshot_broadcast;

@@ -858,7 +858,7 @@ void tick_broadcast_setup_oneshot(struct clock_event_device *bc)
tick_broadcast_oneshot_mask, tmpmask);

if (was_periodic && !cpumask_empty(tmpmask)) {
- clockevents_set_mode(bc, CLOCK_EVT_MODE_ONESHOT);
+ clockevents_set_state(bc, CLOCK_EVT_STATE_ONESHOT);
tick_broadcast_init_next_event(tmpmask,
tick_next_period);
tick_broadcast_set_event(bc, cpu, tick_next_period, 1);
@@ -894,14 +894,28 @@ void tick_broadcast_switch_to_oneshot(void)
raw_spin_unlock_irqrestore(&tick_broadcast_lock, flags);
}

+#ifdef CONFIG_HOTPLUG_CPU
+void hotplug_cpu__broadcast_tick_pull(int deadcpu)
+{
+ struct clock_event_device *bc;
+ unsigned long flags;
+
+ raw_spin_lock_irqsave(&tick_broadcast_lock, flags);
+ bc = tick_broadcast_device.evtdev;
+
+ if (bc && broadcast_needs_cpu(bc, deadcpu)) {
+ /* This moves the broadcast assignment to this CPU: */
+ clockevents_program_event(bc, bc->next_event, 1);
+ }
+ raw_spin_unlock_irqrestore(&tick_broadcast_lock, flags);
+}

/*
* Remove a dead CPU from broadcasting
*/
-void tick_shutdown_broadcast_oneshot(unsigned int *cpup)
+void tick_shutdown_broadcast_oneshot(unsigned int cpu)
{
unsigned long flags;
- unsigned int cpu = *cpup;

raw_spin_lock_irqsave(&tick_broadcast_lock, flags);

@@ -913,10 +927,9 @@ void tick_shutdown_broadcast_oneshot(unsigned int *cpup)
cpumask_clear_cpu(cpu, tick_broadcast_pending_mask);
cpumask_clear_cpu(cpu, tick_broadcast_force_mask);

- broadcast_move_bc(cpu);
-
raw_spin_unlock_irqrestore(&tick_broadcast_lock, flags);
}
+#endif

/*
* Check, whether the broadcast device is in one shot mode
diff --git a/kernel/time/tick-common.c b/kernel/time/tick-common.c
index f7c515595b42..3ae6afa1eb98 100644
--- a/kernel/time/tick-common.c
+++ b/kernel/time/tick-common.c
@@ -102,7 +102,7 @@ void tick_handle_periodic(struct clock_event_device *dev)

tick_periodic(cpu);

- if (dev->mode != CLOCK_EVT_MODE_ONESHOT)
+ if (dev->state != CLOCK_EVT_STATE_ONESHOT)
return;
for (;;) {
/*
@@ -140,7 +140,7 @@ void tick_setup_periodic(struct clock_event_device *dev, int broadcast)

if ((dev->features & CLOCK_EVT_FEAT_PERIODIC) &&
!tick_broadcast_oneshot_active()) {
- clockevents_set_mode(dev, CLOCK_EVT_MODE_PERIODIC);
+ clockevents_set_state(dev, CLOCK_EVT_STATE_PERIODIC);
} else {
unsigned long seq;
ktime_t next;
@@ -150,7 +150,7 @@ void tick_setup_periodic(struct clock_event_device *dev, int broadcast)
next = tick_next_period;
} while (read_seqretry(&jiffies_lock, seq));

- clockevents_set_mode(dev, CLOCK_EVT_MODE_ONESHOT);
+ clockevents_set_state(dev, CLOCK_EVT_STATE_ONESHOT);

for (;;) {
if (!clockevents_program_event(dev, next, false))
@@ -332,14 +332,16 @@ void tick_check_new_device(struct clock_event_device *newdev)
tick_install_broadcast_device(newdev);
}

+#ifdef CONFIG_HOTPLUG_CPU
/*
* Transfer the do_timer job away from a dying cpu.
*
- * Called with interrupts disabled.
+ * Called with interrupts disabled. Not locking required. If
+ * tick_do_timer_cpu is owned by this cpu, nothing can change it.
*/
-void tick_handover_do_timer(int *cpup)
+void tick_handover_do_timer(void)
{
- if (*cpup == tick_do_timer_cpu) {
+ if (tick_do_timer_cpu == smp_processor_id()) {
int cpu = cpumask_first(cpu_online_mask);

tick_do_timer_cpu = (cpu < nr_cpu_ids) ? cpu :
@@ -354,9 +356,9 @@ void tick_handover_do_timer(int *cpup)
* access the hardware device itself.
* We just set the mode and remove it from the lists.
*/
-void tick_shutdown(unsigned int *cpup)
+void tick_shutdown(unsigned int cpu)
{
- struct tick_device *td = &per_cpu(tick_cpu_device, *cpup);
+ struct tick_device *td = &per_cpu(tick_cpu_device, cpu);
struct clock_event_device *dev = td->evtdev;

td->mode = TICKDEV_MODE_PERIODIC;
@@ -365,27 +367,42 @@ void tick_shutdown(unsigned int *cpup)
* Prevent that the clock events layer tries to call
* the set mode function!
*/
+ dev->state = CLOCK_EVT_STATE_DETACHED;
dev->mode = CLOCK_EVT_MODE_UNUSED;
clockevents_exchange_device(dev, NULL);
dev->event_handler = clockevents_handle_noop;
td->evtdev = NULL;
}
}
+#endif

-void tick_suspend(void)
+/**
+ * tick_suspend_local - Suspend the local tick device
+ *
+ * Called from the local cpu for freeze with interrupts disabled.
+ *
+ * No locks required. Nothing can change the per cpu device.
+ */
+void tick_suspend_local(void)
{
struct tick_device *td = this_cpu_ptr(&tick_cpu_device);

clockevents_shutdown(td->evtdev);
}

-void tick_resume(void)
+/**
+ * tick_resume_local - Resume the local tick device
+ *
+ * Called from the local CPU for unfreeze or XEN resume magic.
+ *
+ * No locks required. Nothing can change the per cpu device.
+ */
+void tick_resume_local(void)
{
struct tick_device *td = this_cpu_ptr(&tick_cpu_device);
- int broadcast = tick_resume_broadcast();
-
- clockevents_set_mode(td->evtdev, CLOCK_EVT_MODE_RESUME);
+ bool broadcast = tick_resume_check_broadcast();

+ clockevents_tick_resume(td->evtdev);
if (!broadcast) {
if (td->mode == TICKDEV_MODE_PERIODIC)
tick_setup_periodic(td->evtdev, 0);
@@ -394,6 +411,35 @@ void tick_resume(void)
}
}

+/**
+ * tick_suspend - Suspend the tick and the broadcast device
+ *
+ * Called from syscore_suspend() via timekeeping_suspend with only one
+ * CPU online and interrupts disabled or from tick_unfreeze() under
+ * tick_freeze_lock.
+ *
+ * No locks required. Nothing can change the per cpu device.
+ */
+void tick_suspend(void)
+{
+ tick_suspend_local();
+ tick_suspend_broadcast();
+}
+
+/**
+ * tick_resume - Resume the tick and the broadcast device
+ *
+ * Called from syscore_resume() via timekeeping_resume with only one
+ * CPU online and interrupts disabled.
+ *
+ * No locks required. Nothing can change the per cpu device.
+ */
+void tick_resume(void)
+{
+ tick_resume_broadcast();
+ tick_resume_local();
+}
+
static DEFINE_RAW_SPINLOCK(tick_freeze_lock);
static unsigned int tick_freeze_depth;

@@ -411,12 +457,10 @@ void tick_freeze(void)
raw_spin_lock(&tick_freeze_lock);

tick_freeze_depth++;
- if (tick_freeze_depth == num_online_cpus()) {
+ if (tick_freeze_depth == num_online_cpus())
timekeeping_suspend();
- } else {
- tick_suspend();
- tick_suspend_broadcast();
- }
+ else
+ tick_suspend_local();

raw_spin_unlock(&tick_freeze_lock);
}
@@ -437,7 +481,7 @@ void tick_unfreeze(void)
if (tick_freeze_depth == num_online_cpus())
timekeeping_resume();
else
- tick_resume();
+ tick_resume_local();

tick_freeze_depth--;

diff --git a/kernel/time/tick-internal.h b/kernel/time/tick-internal.h
index 366aeb4f2c66..b64fdd8054c5 100644
--- a/kernel/time/tick-internal.h
+++ b/kernel/time/tick-internal.h
@@ -5,15 +5,12 @@
#include <linux/tick.h>

#include "timekeeping.h"
+#include "tick-sched.h"

-extern seqlock_t jiffies_lock;
+#ifdef CONFIG_GENERIC_CLOCKEVENTS

-#define CS_NAME_LEN 32
-
-#ifdef CONFIG_GENERIC_CLOCKEVENTS_BUILD
-
-#define TICK_DO_TIMER_NONE -1
-#define TICK_DO_TIMER_BOOT -2
+# define TICK_DO_TIMER_NONE -1
+# define TICK_DO_TIMER_BOOT -2

DECLARE_PER_CPU(struct tick_device, tick_cpu_device);
extern ktime_t tick_next_period;
@@ -23,21 +20,72 @@ extern int tick_do_timer_cpu __read_mostly;
extern void tick_setup_periodic(struct clock_event_device *dev, int broadcast);
extern void tick_handle_periodic(struct clock_event_device *dev);
extern void tick_check_new_device(struct clock_event_device *dev);
-extern void tick_handover_do_timer(int *cpup);
-extern void tick_shutdown(unsigned int *cpup);
+extern void tick_shutdown(unsigned int cpu);
extern void tick_suspend(void);
extern void tick_resume(void);
extern bool tick_check_replacement(struct clock_event_device *curdev,
struct clock_event_device *newdev);
extern void tick_install_replacement(struct clock_event_device *dev);
+extern int tick_is_oneshot_available(void);
+extern struct tick_device *tick_get_device(int cpu);

-extern void clockevents_shutdown(struct clock_event_device *dev);
+extern int clockevents_tick_resume(struct clock_event_device *dev);
+/* Check, if the device is functional or a dummy for broadcast */
+static inline int tick_device_is_functional(struct clock_event_device *dev)
+{
+ return !(dev->features & CLOCK_EVT_FEAT_DUMMY);
+}

+extern void clockevents_shutdown(struct clock_event_device *dev);
+extern void clockevents_exchange_device(struct clock_event_device *old,
+ struct clock_event_device *new);
+extern void clockevents_set_state(struct clock_event_device *dev,
+ enum clock_event_state state);
+extern int clockevents_program_event(struct clock_event_device *dev,
+ ktime_t expires, bool force);
+extern void clockevents_handle_noop(struct clock_event_device *dev);
+extern int __clockevents_update_freq(struct clock_event_device *dev, u32 freq);
extern ssize_t sysfs_get_uname(const char *buf, char *dst, size_t cnt);

-/*
- * NO_HZ / high resolution timer shared code
- */
+/* Broadcasting support */
+# ifdef CONFIG_GENERIC_CLOCKEVENTS_BROADCAST
+extern int tick_device_uses_broadcast(struct clock_event_device *dev, int cpu);
+extern void tick_install_broadcast_device(struct clock_event_device *dev);
+extern int tick_is_broadcast_device(struct clock_event_device *dev);
+extern void tick_shutdown_broadcast(unsigned int cpu);
+extern void tick_suspend_broadcast(void);
+extern void tick_resume_broadcast(void);
+extern bool tick_resume_check_broadcast(void);
+extern void tick_broadcast_init(void);
+extern void tick_set_periodic_handler(struct clock_event_device *dev, int broadcast);
+extern int tick_broadcast_update_freq(struct clock_event_device *dev, u32 freq);
+extern struct tick_device *tick_get_broadcast_device(void);
+extern struct cpumask *tick_get_broadcast_mask(void);
+# else /* !CONFIG_GENERIC_CLOCKEVENTS_BROADCAST: */
+static inline void tick_install_broadcast_device(struct clock_event_device *dev) { }
+static inline int tick_is_broadcast_device(struct clock_event_device *dev) { return 0; }
+static inline int tick_device_uses_broadcast(struct clock_event_device *dev, int cpu) { return 0; }
+static inline void tick_do_periodic_broadcast(struct clock_event_device *d) { }
+static inline void tick_shutdown_broadcast(unsigned int cpu) { }
+static inline void tick_suspend_broadcast(void) { }
+static inline void tick_resume_broadcast(void) { }
+static inline bool tick_resume_check_broadcast(void) { return false; }
+static inline void tick_broadcast_init(void) { }
+static inline int tick_broadcast_update_freq(struct clock_event_device *dev, u32 freq) { return -ENODEV; }
+
+/* Set the periodic handler in non broadcast mode */
+static inline void tick_set_periodic_handler(struct clock_event_device *dev, int broadcast)
+{
+ dev->event_handler = tick_handle_periodic;
+}
+# endif /* !CONFIG_GENERIC_CLOCKEVENTS_BROADCAST */
+
+#else /* !GENERIC_CLOCKEVENTS: */
+static inline void tick_suspend(void) { }
+static inline void tick_resume(void) { }
+#endif /* !GENERIC_CLOCKEVENTS */
+
+/* Oneshot related functions */
#ifdef CONFIG_TICK_ONESHOT
extern void tick_setup_oneshot(struct clock_event_device *newdev,
void (*handler)(struct clock_event_device *),
@@ -46,58 +94,42 @@ extern int tick_program_event(ktime_t expires, int force);
extern void tick_oneshot_notify(void);
extern int tick_switch_to_oneshot(void (*handler)(struct clock_event_device *));
extern void tick_resume_oneshot(void);
-# ifdef CONFIG_GENERIC_CLOCKEVENTS_BROADCAST
+static inline bool tick_oneshot_possible(void) { return true; }
+extern int tick_oneshot_mode_active(void);
+extern void tick_clock_notify(void);
+extern int tick_check_oneshot_change(int allow_nohz);
+extern int tick_init_highres(void);
+#else /* !CONFIG_TICK_ONESHOT: */
+static inline
+void tick_setup_oneshot(struct clock_event_device *newdev,
+ void (*handler)(struct clock_event_device *),
+ ktime_t nextevt) { BUG(); }
+static inline void tick_resume_oneshot(void) { BUG(); }
+static inline int tick_program_event(ktime_t expires, int force) { return 0; }
+static inline void tick_oneshot_notify(void) { }
+static inline bool tick_oneshot_possible(void) { return false; }
+static inline int tick_oneshot_mode_active(void) { return 0; }
+static inline void tick_clock_notify(void) { }
+static inline int tick_check_oneshot_change(int allow_nohz) { return 0; }
+#endif /* !CONFIG_TICK_ONESHOT */
+
+/* Functions related to oneshot broadcasting */
+#if defined(CONFIG_GENERIC_CLOCKEVENTS_BROADCAST) && defined(CONFIG_TICK_ONESHOT)
extern void tick_broadcast_setup_oneshot(struct clock_event_device *bc);
-extern int tick_broadcast_oneshot_control(unsigned long reason);
extern void tick_broadcast_switch_to_oneshot(void);
-extern void tick_shutdown_broadcast_oneshot(unsigned int *cpup);
-extern int tick_resume_broadcast_oneshot(struct clock_event_device *bc);
+extern void tick_shutdown_broadcast_oneshot(unsigned int cpu);
extern int tick_broadcast_oneshot_active(void);
extern void tick_check_oneshot_broadcast_this_cpu(void);
bool tick_broadcast_oneshot_available(void);
-# else /* BROADCAST */
-static inline void tick_broadcast_setup_oneshot(struct clock_event_device *bc)
-{
- BUG();
-}
-static inline int tick_broadcast_oneshot_control(unsigned long reason) { return 0; }
+extern struct cpumask *tick_get_broadcast_oneshot_mask(void);
+#else /* !(BROADCAST && ONESHOT): */
+static inline void tick_broadcast_setup_oneshot(struct clock_event_device *bc) { BUG(); }
static inline void tick_broadcast_switch_to_oneshot(void) { }
-static inline void tick_shutdown_broadcast_oneshot(unsigned int *cpup) { }
+static inline void tick_shutdown_broadcast_oneshot(unsigned int cpu) { }
static inline int tick_broadcast_oneshot_active(void) { return 0; }
static inline void tick_check_oneshot_broadcast_this_cpu(void) { }
-static inline bool tick_broadcast_oneshot_available(void) { return true; }
-# endif /* !BROADCAST */
-
-#else /* !ONESHOT */
-static inline
-void tick_setup_oneshot(struct clock_event_device *newdev,
- void (*handler)(struct clock_event_device *),
- ktime_t nextevt)
-{
- BUG();
-}
-static inline void tick_resume_oneshot(void)
-{
- BUG();
-}
-static inline int tick_program_event(ktime_t expires, int force)
-{
- return 0;
-}
-static inline void tick_oneshot_notify(void) { }
-static inline void tick_broadcast_setup_oneshot(struct clock_event_device *bc)
-{
- BUG();
-}
-static inline int tick_broadcast_oneshot_control(unsigned long reason) { return 0; }
-static inline void tick_shutdown_broadcast_oneshot(unsigned int *cpup) { }
-static inline int tick_resume_broadcast_oneshot(struct clock_event_device *bc)
-{
- return 0;
-}
-static inline int tick_broadcast_oneshot_active(void) { return 0; }
-static inline bool tick_broadcast_oneshot_available(void) { return false; }
-#endif /* !TICK_ONESHOT */
+static inline bool tick_broadcast_oneshot_available(void) { return tick_oneshot_possible(); }
+#endif /* !(BROADCAST && ONESHOT) */

/* NO_HZ_FULL internal */
#ifdef CONFIG_NO_HZ_FULL
@@ -105,68 +137,3 @@ extern void tick_nohz_init(void);
# else
static inline void tick_nohz_init(void) { }
#endif
-
-/*
- * Broadcasting support
- */
-#ifdef CONFIG_GENERIC_CLOCKEVENTS_BROADCAST
-extern int tick_device_uses_broadcast(struct clock_event_device *dev, int cpu);
-extern void tick_install_broadcast_device(struct clock_event_device *dev);
-extern int tick_is_broadcast_device(struct clock_event_device *dev);
-extern void tick_broadcast_on_off(unsigned long reason, int *oncpu);
-extern void tick_shutdown_broadcast(unsigned int *cpup);
-extern void tick_suspend_broadcast(void);
-extern int tick_resume_broadcast(void);
-extern void tick_broadcast_init(void);
-extern void
-tick_set_periodic_handler(struct clock_event_device *dev, int broadcast);
-int tick_broadcast_update_freq(struct clock_event_device *dev, u32 freq);
-
-#else /* !BROADCAST */
-
-static inline void tick_install_broadcast_device(struct clock_event_device *dev)
-{
-}
-
-static inline int tick_is_broadcast_device(struct clock_event_device *dev)
-{
- return 0;
-}
-static inline int tick_device_uses_broadcast(struct clock_event_device *dev,
- int cpu)
-{
- return 0;
-}
-static inline void tick_do_periodic_broadcast(struct clock_event_device *d) { }
-static inline void tick_broadcast_on_off(unsigned long reason, int *oncpu) { }
-static inline void tick_shutdown_broadcast(unsigned int *cpup) { }
-static inline void tick_suspend_broadcast(void) { }
-static inline int tick_resume_broadcast(void) { return 0; }
-static inline void tick_broadcast_init(void) { }
-static inline int tick_broadcast_update_freq(struct clock_event_device *dev,
- u32 freq) { return -ENODEV; }
-
-/*
- * Set the periodic handler in non broadcast mode
- */
-static inline void tick_set_periodic_handler(struct clock_event_device *dev,
- int broadcast)
-{
- dev->event_handler = tick_handle_periodic;
-}
-#endif /* !BROADCAST */
-
-/*
- * Check, if the device is functional or a dummy for broadcast
- */
-static inline int tick_device_is_functional(struct clock_event_device *dev)
-{
- return !(dev->features & CLOCK_EVT_FEAT_DUMMY);
-}
-
-int __clockevents_update_freq(struct clock_event_device *dev, u32 freq);
-
-#endif
-
-extern void do_timer(unsigned long ticks);
-extern void update_wall_time(void);
diff --git a/kernel/time/tick-oneshot.c b/kernel/time/tick-oneshot.c
index 7ce740e78e1b..67a64b1670bf 100644
--- a/kernel/time/tick-oneshot.c
+++ b/kernel/time/tick-oneshot.c
@@ -38,7 +38,7 @@ void tick_resume_oneshot(void)
{
struct clock_event_device *dev = __this_cpu_read(tick_cpu_device.evtdev);

- clockevents_set_mode(dev, CLOCK_EVT_MODE_ONESHOT);
+ clockevents_set_state(dev, CLOCK_EVT_STATE_ONESHOT);
clockevents_program_event(dev, ktime_get(), true);
}

@@ -50,7 +50,7 @@ void tick_setup_oneshot(struct clock_event_device *newdev,
ktime_t next_event)
{
newdev->event_handler = handler;
- clockevents_set_mode(newdev, CLOCK_EVT_MODE_ONESHOT);
+ clockevents_set_state(newdev, CLOCK_EVT_STATE_ONESHOT);
clockevents_program_event(newdev, next_event, true);
}

@@ -81,7 +81,7 @@ int tick_switch_to_oneshot(void (*handler)(struct clock_event_device *))

td->mode = TICKDEV_MODE_ONESHOT;
dev->event_handler = handler;
- clockevents_set_mode(dev, CLOCK_EVT_MODE_ONESHOT);
+ clockevents_set_state(dev, CLOCK_EVT_STATE_ONESHOT);
tick_broadcast_switch_to_oneshot();
return 0;
}
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index a4c4edac4528..914259128145 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -34,7 +34,7 @@
/*
* Per cpu nohz control structure
*/
-DEFINE_PER_CPU(struct tick_sched, tick_cpu_sched);
+static DEFINE_PER_CPU(struct tick_sched, tick_cpu_sched);

/*
* The time, when the last jiffy update happened. Protected by jiffies_lock.
@@ -416,6 +416,11 @@ static int __init setup_tick_nohz(char *str)

__setup("nohz=", setup_tick_nohz);

+int tick_nohz_tick_stopped(void)
+{
+ return __this_cpu_read(tick_cpu_sched.tick_stopped);
+}
+
/**
* tick_nohz_update_jiffies - update jiffies when idle was interrupted
*
diff --git a/kernel/time/tick-sched.h b/kernel/time/tick-sched.h
new file mode 100644
index 000000000000..28b5da3e1a17
--- /dev/null
+++ b/kernel/time/tick-sched.h
@@ -0,0 +1,74 @@
+#ifndef _TICK_SCHED_H
+#define _TICK_SCHED_H
+
+#include <linux/hrtimer.h>
+
+enum tick_device_mode {
+ TICKDEV_MODE_PERIODIC,
+ TICKDEV_MODE_ONESHOT,
+};
+
+struct tick_device {
+ struct clock_event_device *evtdev;
+ enum tick_device_mode mode;
+};
+
+enum tick_nohz_mode {
+ NOHZ_MODE_INACTIVE,
+ NOHZ_MODE_LOWRES,
+ NOHZ_MODE_HIGHRES,
+};
+
+/**
+ * struct tick_sched - sched tick emulation and no idle tick control/stats
+ * @sched_timer: hrtimer to schedule the periodic tick in high
+ * resolution mode
+ * @last_tick: Store the last tick expiry time when the tick
+ * timer is modified for nohz sleeps. This is necessary
+ * to resume the tick timer operation in the timeline
+ * when the CPU returns from nohz sleep.
+ * @tick_stopped: Indicator that the idle tick has been stopped
+ * @idle_jiffies: jiffies at the entry to idle for idle time accounting
+ * @idle_calls: Total number of idle calls
+ * @idle_sleeps: Number of idle calls, where the sched tick was stopped
+ * @idle_entrytime: Time when the idle call was entered
+ * @idle_waketime: Time when the idle was interrupted
+ * @idle_exittime: Time when the idle state was left
+ * @idle_sleeptime: Sum of the time slept in idle with sched tick stopped
+ * @iowait_sleeptime: Sum of the time slept in idle with sched tick stopped, with IO outstanding
+ * @sleep_length: Duration of the current idle sleep
+ * @do_timer_lst: CPU was the last one doing do_timer before going idle
+ */
+struct tick_sched {
+ struct hrtimer sched_timer;
+ unsigned long check_clocks;
+ enum tick_nohz_mode nohz_mode;
+ ktime_t last_tick;
+ int inidle;
+ int tick_stopped;
+ unsigned long idle_jiffies;
+ unsigned long idle_calls;
+ unsigned long idle_sleeps;
+ int idle_active;
+ ktime_t idle_entrytime;
+ ktime_t idle_waketime;
+ ktime_t idle_exittime;
+ ktime_t idle_sleeptime;
+ ktime_t iowait_sleeptime;
+ ktime_t sleep_length;
+ unsigned long last_jiffies;
+ unsigned long next_jiffies;
+ ktime_t idle_expires;
+ int do_timer_last;
+};
+
+extern struct tick_sched *tick_get_tick_sched(int cpu);
+
+extern void tick_setup_sched_timer(void);
+#if defined CONFIG_NO_HZ_COMMON || defined CONFIG_HIGH_RES_TIMERS
+extern void tick_cancel_sched_timer(int cpu);
+#else
+static inline void tick_cancel_sched_timer(int cpu) { }
+#endif
+
+#endif
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 91db94136c10..946acb72179f 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -59,17 +59,15 @@ struct tk_fast {
};

static struct tk_fast tk_fast_mono ____cacheline_aligned;
+static struct tk_fast tk_fast_raw ____cacheline_aligned;

/* flag for if timekeeping is suspended */
int __read_mostly timekeeping_suspended;

-/* Flag for if there is a persistent clock on this platform */
-bool __read_mostly persistent_clock_exist = false;
-
static inline void tk_normalize_xtime(struct timekeeper *tk)
{
- while (tk->tkr.xtime_nsec >= ((u64)NSEC_PER_SEC << tk->tkr.shift)) {
- tk->tkr.xtime_nsec -= (u64)NSEC_PER_SEC << tk->tkr.shift;
+ while (tk->tkr_mono.xtime_nsec >= ((u64)NSEC_PER_SEC << tk->tkr_mono.shift)) {
+ tk->tkr_mono.xtime_nsec -= (u64)NSEC_PER_SEC << tk->tkr_mono.shift;
tk->xtime_sec++;
}
}
@@ -79,20 +77,20 @@ static inline struct timespec64 tk_xtime(struct timekeeper *tk)
struct timespec64 ts;

ts.tv_sec = tk->xtime_sec;
- ts.tv_nsec = (long)(tk->tkr.xtime_nsec >> tk->tkr.shift);
+ ts.tv_nsec = (long)(tk->tkr_mono.xtime_nsec >> tk->tkr_mono.shift);
return ts;
}

static void tk_set_xtime(struct timekeeper *tk, const struct timespec64 *ts)
{
tk->xtime_sec = ts->tv_sec;
- tk->tkr.xtime_nsec = (u64)ts->tv_nsec << tk->tkr.shift;
+ tk->tkr_mono.xtime_nsec = (u64)ts->tv_nsec << tk->tkr_mono.shift;
}

static void tk_xtime_add(struct timekeeper *tk, const struct timespec64 *ts)
{
tk->xtime_sec += ts->tv_sec;
- tk->tkr.xtime_nsec += (u64)ts->tv_nsec << tk->tkr.shift;
+ tk->tkr_mono.xtime_nsec += (u64)ts->tv_nsec << tk->tkr_mono.shift;
tk_normalize_xtime(tk);
}

@@ -118,6 +116,117 @@ static inline void tk_update_sleep_time(struct timekeeper *tk, ktime_t delta)
tk->offs_boot = ktime_add(tk->offs_boot, delta);
}

+#ifdef CONFIG_DEBUG_TIMEKEEPING
+#define WARNING_FREQ (HZ*300) /* 5 minute rate-limiting */
+/*
+ * These simple flag variables are managed
+ * without locks, which is racy, but ok since
+ * we don't really care about being super
+ * precise about how many events were seen,
+ * just that a problem was observed.
+ */
+static int timekeeping_underflow_seen;
+static int timekeeping_overflow_seen;
+
+/* last_warning is only modified under the timekeeping lock */
+static long timekeeping_last_warning;
+
+static void timekeeping_check_update(struct timekeeper *tk, cycle_t offset)
+{
+
+ cycle_t max_cycles = tk->tkr_mono.clock->max_cycles;
+ const char *name = tk->tkr_mono.clock->name;
+
+ if (offset > max_cycles) {
+ printk_deferred("WARNING: timekeeping: Cycle offset (%lld) is larger than allowed by the '%s' clock's max_cycles value (%lld): time overflow danger\n",
+ offset, name, max_cycles);
+ printk_deferred(" timekeeping: Your kernel is sick, but tries to cope by capping time updates\n");
+ } else {
+ if (offset > (max_cycles >> 1)) {
+ printk_deferred("INFO: timekeeping: Cycle offset (%lld) is larger than the the '%s' clock's 50%% safety margin (%lld)\n",
+ offset, name, max_cycles >> 1);
+ printk_deferred(" timekeeping: Your kernel is still fine, but is feeling a bit nervous\n");
+ }
+ }
+
+ if (timekeeping_underflow_seen) {
+ if (jiffies - timekeeping_last_warning > WARNING_FREQ) {
+ printk_deferred("WARNING: Underflow in clocksource '%s' observed, time update ignored.\n", name);
+ printk_deferred(" Please report this, consider using a different clocksource, if possible.\n");
+ printk_deferred(" Your kernel is probably still fine.\n");
+ timekeeping_last_warning = jiffies;
+ }
+ timekeeping_underflow_seen = 0;
+ }
+
+ if (timekeeping_overflow_seen) {
+ if (jiffies - timekeeping_last_warning > WARNING_FREQ) {
+ printk_deferred("WARNING: Overflow in clocksource '%s' observed, time update capped.\n", name);
+ printk_deferred(" Please report this, consider using a different clocksource, if possible.\n");
+ printk_deferred(" Your kernel is probably still fine.\n");
+ timekeeping_last_warning = jiffies;
+ }
+ timekeeping_overflow_seen = 0;
+ }
+}
+
+static inline cycle_t timekeeping_get_delta(struct tk_read_base *tkr)
+{
+ cycle_t now, last, mask, max, delta;
+ unsigned int seq;
+
+ /*
+ * Since we're called holding a seqlock, the data may shift
+ * under us while we're doing the calculation. This can cause
+ * false positives, since we'd note a problem but throw the
+ * results away. So nest another seqlock here to atomically
+ * grab the points we are checking with.
+ */
+ do {
+ seq = read_seqcount_begin(&tk_core.seq);
+ now = tkr->read(tkr->clock);
+ last = tkr->cycle_last;
+ mask = tkr->mask;
+ max = tkr->clock->max_cycles;
+ } while (read_seqcount_retry(&tk_core.seq, seq));
+
+ delta = clocksource_delta(now, last, mask);
+
+ /*
+ * Try to catch underflows by checking if we are seeing small
+ * mask-relative negative values.
+ */
+ if (unlikely((~delta & mask) < (mask >> 3))) {
+ timekeeping_underflow_seen = 1;
+ delta = 0;
+ }
+
+ /* Cap delta value to the max_cycles values to avoid mult overflows */
+ if (unlikely(delta > max)) {
+ timekeeping_overflow_seen = 1;
+ delta = tkr->clock->max_cycles;
+ }
+
+ return delta;
+}
+#else
+static inline void timekeeping_check_update(struct timekeeper *tk, cycle_t offset)
+{
+}
+static inline cycle_t timekeeping_get_delta(struct tk_read_base *tkr)
+{
+ cycle_t cycle_now, delta;
+
+ /* read clocksource */
+ cycle_now = tkr->read(tkr->clock);
+
+ /* calculate the delta since the last update_wall_time */
+ delta = clocksource_delta(cycle_now, tkr->cycle_last, tkr->mask);
+
+ return delta;
+}
+#endif
+
/**
* tk_setup_internals - Set up internals to use clocksource clock.
*
@@ -135,11 +244,16 @@ static void tk_setup_internals(struct timekeeper *tk, struct clocksource *clock)
u64 tmp, ntpinterval;
struct clocksource *old_clock;

- old_clock = tk->tkr.clock;
- tk->tkr.clock = clock;
- tk->tkr.read = clock->read;
- tk->tkr.mask = clock->mask;
- tk->tkr.cycle_last = tk->tkr.read(clock);
+ old_clock = tk->tkr_mono.clock;
+ tk->tkr_mono.clock = clock;
+ tk->tkr_mono.read = clock->read;
+ tk->tkr_mono.mask = clock->mask;
+ tk->tkr_mono.cycle_last = tk->tkr_mono.read(clock);
+
+ tk->tkr_raw.clock = clock;
+ tk->tkr_raw.read = clock->read;
+ tk->tkr_raw.mask = clock->mask;
+ tk->tkr_raw.cycle_last = tk->tkr_mono.cycle_last;

/* Do the ns -> cycle conversion first, using original mult */
tmp = NTP_INTERVAL_LENGTH;
@@ -163,11 +277,14 @@ static void tk_setup_internals(struct timekeeper *tk, struct clocksource *clock)
if (old_clock) {
int shift_change = clock->shift - old_clock->shift;
if (shift_change < 0)
- tk->tkr.xtime_nsec >>= -shift_change;
+ tk->tkr_mono.xtime_nsec >>= -shift_change;
else
- tk->tkr.xtime_nsec <<= shift_change;
+ tk->tkr_mono.xtime_nsec <<= shift_change;
}
- tk->tkr.shift = clock->shift;
+ tk->tkr_raw.xtime_nsec = 0;
+
+ tk->tkr_mono.shift = clock->shift;
+ tk->tkr_raw.shift = clock->shift;

tk->ntp_error = 0;
tk->ntp_error_shift = NTP_SCALE_SHIFT - clock->shift;
@@ -178,7 +295,8 @@ static void tk_setup_internals(struct timekeeper *tk, struct clocksource *clock)
* active clocksource. These value will be adjusted via NTP
* to counteract clock drifting.
*/
- tk->tkr.mult = clock->mult;
+ tk->tkr_mono.mult = clock->mult;
+ tk->tkr_raw.mult = clock->mult;
tk->ntp_err_mult = 0;
}

@@ -193,14 +311,10 @@ static inline u32 arch_gettimeoffset(void) { return 0; }

static inline s64 timekeeping_get_ns(struct tk_read_base *tkr)
{
- cycle_t cycle_now, delta;
+ cycle_t delta;
s64 nsec;

- /* read clocksource: */
- cycle_now = tkr->read(tkr->clock);
-
- /* calculate the delta since the last update_wall_time: */
- delta = clocksource_delta(cycle_now, tkr->cycle_last, tkr->mask);
+ delta = timekeeping_get_delta(tkr);

nsec = delta * tkr->mult + tkr->xtime_nsec;
nsec >>= tkr->shift;
@@ -209,25 +323,6 @@ static inline s64 timekeeping_get_ns(struct tk_read_base *tkr)
return nsec + arch_gettimeoffset();
}

-static inline s64 timekeeping_get_ns_raw(struct timekeeper *tk)
-{
- struct clocksource *clock = tk->tkr.clock;
- cycle_t cycle_now, delta;
- s64 nsec;
-
- /* read clocksource: */
- cycle_now = tk->tkr.read(clock);
-
- /* calculate the delta since the last update_wall_time: */
- delta = clocksource_delta(cycle_now, tk->tkr.cycle_last, tk->tkr.mask);
-
- /* convert delta to nanoseconds. */
- nsec = clocksource_cyc2ns(delta, clock->mult, clock->shift);
-
- /* If arch requires, add in get_arch_timeoffset() */
- return nsec + arch_gettimeoffset();
-}
-
/**
* update_fast_timekeeper - Update the fast and NMI safe monotonic timekeeper.
* @tkr: Timekeeping readout base from which we take the update
@@ -267,18 +362,18 @@ static inline s64 timekeeping_get_ns_raw(struct timekeeper *tk)
* slightly wrong timestamp (a few nanoseconds). See
* @ktime_get_mono_fast_ns.
*/
-static void update_fast_timekeeper(struct tk_read_base *tkr)
+static void update_fast_timekeeper(struct tk_read_base *tkr, struct tk_fast *tkf)
{
- struct tk_read_base *base = tk_fast_mono.base;
+ struct tk_read_base *base = tkf->base;

/* Force readers off to base[1] */
- raw_write_seqcount_latch(&tk_fast_mono.seq);
+ raw_write_seqcount_latch(&tkf->seq);

/* Update base[0] */
memcpy(base, tkr, sizeof(*base));

/* Force readers back to base[0] */
- raw_write_seqcount_latch(&tk_fast_mono.seq);
+ raw_write_seqcount_latch(&tkf->seq);

/* Update base[1] */
memcpy(base + 1, base, sizeof(*base));
@@ -316,22 +411,33 @@ static void update_fast_timekeeper(struct tk_read_base *tkr)
* of the following timestamps. Callers need to be aware of that and
* deal with it.
*/
-u64 notrace ktime_get_mono_fast_ns(void)
+static __always_inline u64 __ktime_get_fast_ns(struct tk_fast *tkf)
{
struct tk_read_base *tkr;
unsigned int seq;
u64 now;

do {
- seq = raw_read_seqcount(&tk_fast_mono.seq);
- tkr = tk_fast_mono.base + (seq & 0x01);
- now = ktime_to_ns(tkr->base_mono) + timekeeping_get_ns(tkr);
+ seq = raw_read_seqcount(&tkf->seq);
+ tkr = tkf->base + (seq & 0x01);
+ now = ktime_to_ns(tkr->base) + timekeeping_get_ns(tkr);
+ } while (read_seqcount_retry(&tkf->seq, seq));

- } while (read_seqcount_retry(&tk_fast_mono.seq, seq));
return now;
}
+
+u64 ktime_get_mono_fast_ns(void)
+{
+ return __ktime_get_fast_ns(&tk_fast_mono);
+}
EXPORT_SYMBOL_GPL(ktime_get_mono_fast_ns);

+u64 ktime_get_raw_fast_ns(void)
+{
+ return __ktime_get_fast_ns(&tk_fast_raw);
+}
+EXPORT_SYMBOL_GPL(ktime_get_raw_fast_ns);
+
/* Suspend-time cycles value for halted fast timekeeper. */
static cycle_t cycles_at_suspend;

@@ -353,12 +459,17 @@ static cycle_t dummy_clock_read(struct clocksource *cs)
static void halt_fast_timekeeper(struct timekeeper *tk)
{
static struct tk_read_base tkr_dummy;
- struct tk_read_base *tkr = &tk->tkr;
+ struct tk_read_base *tkr = &tk->tkr_mono;

memcpy(&tkr_dummy, tkr, sizeof(tkr_dummy));
cycles_at_suspend = tkr->read(tkr->clock);
tkr_dummy.read = dummy_clock_read;
- update_fast_timekeeper(&tkr_dummy);
+ update_fast_timekeeper(&tkr_dummy, &tk_fast_mono);
+
+ tkr = &tk->tkr_raw;
+ memcpy(&tkr_dummy, tkr, sizeof(tkr_dummy));
+ tkr_dummy.read = dummy_clock_read;
+ update_fast_timekeeper(&tkr_dummy, &tk_fast_raw);
}

#ifdef CONFIG_GENERIC_TIME_VSYSCALL_OLD
@@ -369,8 +480,8 @@ static inline void update_vsyscall(struct timekeeper *tk)

xt = timespec64_to_timespec(tk_xtime(tk));
wm = timespec64_to_timespec(tk->wall_to_monotonic);
- update_vsyscall_old(&xt, &wm, tk->tkr.clock, tk->tkr.mult,
- tk->tkr.cycle_last);
+ update_vsyscall_old(&xt, &wm, tk->tkr_mono.clock, tk->tkr_mono.mult,
+ tk->tkr_mono.cycle_last);
}

static inline void old_vsyscall_fixup(struct timekeeper *tk)
@@ -387,11 +498,11 @@ static inline void old_vsyscall_fixup(struct timekeeper *tk)
* (shifted nanoseconds), and CONFIG_GENERIC_TIME_VSYSCALL_OLD
* users are removed, this can be killed.
*/
- remainder = tk->tkr.xtime_nsec & ((1ULL << tk->tkr.shift) - 1);
- tk->tkr.xtime_nsec -= remainder;
- tk->tkr.xtime_nsec += 1ULL << tk->tkr.shift;
+ remainder = tk->tkr_mono.xtime_nsec & ((1ULL << tk->tkr_mono.shift) - 1);
+ tk->tkr_mono.xtime_nsec -= remainder;
+ tk->tkr_mono.xtime_nsec += 1ULL << tk->tkr_mono.shift;
tk->ntp_error += remainder << tk->ntp_error_shift;
- tk->ntp_error -= (1ULL << tk->tkr.shift) << tk->ntp_error_shift;
+ tk->ntp_error -= (1ULL << tk->tkr_mono.shift) << tk->ntp_error_shift;
}
#else
#define old_vsyscall_fixup(tk)
@@ -456,17 +567,17 @@ static inline void tk_update_ktime_data(struct timekeeper *tk)
*/
seconds = (u64)(tk->xtime_sec + tk->wall_to_monotonic.tv_sec);
nsec = (u32) tk->wall_to_monotonic.tv_nsec;
- tk->tkr.base_mono = ns_to_ktime(seconds * NSEC_PER_SEC + nsec);
+ tk->tkr_mono.base = ns_to_ktime(seconds * NSEC_PER_SEC + nsec);

/* Update the monotonic raw base */
- tk->base_raw = timespec64_to_ktime(tk->raw_time);
+ tk->tkr_raw.base = timespec64_to_ktime(tk->raw_time);

/*
* The sum of the nanoseconds portions of xtime and
* wall_to_monotonic can be greater/equal one second. Take
* this into account before updating tk->ktime_sec.
*/
- nsec += (u32)(tk->tkr.xtime_nsec >> tk->tkr.shift);
+ nsec += (u32)(tk->tkr_mono.xtime_nsec >> tk->tkr_mono.shift);
if (nsec >= NSEC_PER_SEC)
seconds++;
tk->ktime_sec = seconds;
@@ -489,7 +600,8 @@ static void timekeeping_update(struct timekeeper *tk, unsigned int action)
memcpy(&shadow_timekeeper, &tk_core.timekeeper,
sizeof(tk_core.timekeeper));

- update_fast_timekeeper(&tk->tkr);
+ update_fast_timekeeper(&tk->tkr_mono, &tk_fast_mono);
+ update_fast_timekeeper(&tk->tkr_raw, &tk_fast_raw);
}

/**
@@ -501,22 +613,23 @@ static void timekeeping_update(struct timekeeper *tk, unsigned int action)
*/
static void timekeeping_forward_now(struct timekeeper *tk)
{
- struct clocksource *clock = tk->tkr.clock;
+ struct clocksource *clock = tk->tkr_mono.clock;
cycle_t cycle_now, delta;
s64 nsec;

- cycle_now = tk->tkr.read(clock);
- delta = clocksource_delta(cycle_now, tk->tkr.cycle_last, tk->tkr.mask);
- tk->tkr.cycle_last = cycle_now;
+ cycle_now = tk->tkr_mono.read(clock);
+ delta = clocksource_delta(cycle_now, tk->tkr_mono.cycle_last, tk->tkr_mono.mask);
+ tk->tkr_mono.cycle_last = cycle_now;
+ tk->tkr_raw.cycle_last = cycle_now;

- tk->tkr.xtime_nsec += delta * tk->tkr.mult;
+ tk->tkr_mono.xtime_nsec += delta * tk->tkr_mono.mult;

/* If arch requires, add in get_arch_timeoffset() */
- tk->tkr.xtime_nsec += (u64)arch_gettimeoffset() << tk->tkr.shift;
+ tk->tkr_mono.xtime_nsec += (u64)arch_gettimeoffset() << tk->tkr_mono.shift;

tk_normalize_xtime(tk);

- nsec = clocksource_cyc2ns(delta, clock->mult, clock->shift);
+ nsec = clocksource_cyc2ns(delta, tk->tkr_raw.mult, tk->tkr_raw.shift);
timespec64_add_ns(&tk->raw_time, nsec);
}

@@ -537,7 +650,7 @@ int __getnstimeofday64(struct timespec64 *ts)
seq = read_seqcount_begin(&tk_core.seq);

ts->tv_sec = tk->xtime_sec;
- nsecs = timekeeping_get_ns(&tk->tkr);
+ nsecs = timekeeping_get_ns(&tk->tkr_mono);

} while (read_seqcount_retry(&tk_core.seq, seq));

@@ -577,8 +690,8 @@ ktime_t ktime_get(void)

do {
seq = read_seqcount_begin(&tk_core.seq);
- base = tk->tkr.base_mono;
- nsecs = timekeeping_get_ns(&tk->tkr);
+ base = tk->tkr_mono.base;
+ nsecs = timekeeping_get_ns(&tk->tkr_mono);

} while (read_seqcount_retry(&tk_core.seq, seq));

@@ -603,8 +716,8 @@ ktime_t ktime_get_with_offset(enum tk_offsets offs)

do {
seq = read_seqcount_begin(&tk_core.seq);
- base = ktime_add(tk->tkr.base_mono, *offset);
- nsecs = timekeeping_get_ns(&tk->tkr);
+ base = ktime_add(tk->tkr_mono.base, *offset);
+ nsecs = timekeeping_get_ns(&tk->tkr_mono);

} while (read_seqcount_retry(&tk_core.seq, seq));

@@ -645,8 +758,8 @@ ktime_t ktime_get_raw(void)

do {
seq = read_seqcount_begin(&tk_core.seq);
- base = tk->base_raw;
- nsecs = timekeeping_get_ns_raw(tk);
+ base = tk->tkr_raw.base;
+ nsecs = timekeeping_get_ns(&tk->tkr_raw);

} while (read_seqcount_retry(&tk_core.seq, seq));

@@ -674,7 +787,7 @@ void ktime_get_ts64(struct timespec64 *ts)
do {
seq = read_seqcount_begin(&tk_core.seq);
ts->tv_sec = tk->xtime_sec;
- nsec = timekeeping_get_ns(&tk->tkr);
+ nsec = timekeeping_get_ns(&tk->tkr_mono);
tomono = tk->wall_to_monotonic;

} while (read_seqcount_retry(&tk_core.seq, seq));
@@ -759,8 +872,8 @@ void getnstime_raw_and_real(struct timespec *ts_raw, struct timespec *ts_real)
ts_real->tv_sec = tk->xtime_sec;
ts_real->tv_nsec = 0;

- nsecs_raw = timekeeping_get_ns_raw(tk);
- nsecs_real = timekeeping_get_ns(&tk->tkr);
+ nsecs_raw = timekeeping_get_ns(&tk->tkr_raw);
+ nsecs_real = timekeeping_get_ns(&tk->tkr_mono);

} while (read_seqcount_retry(&tk_core.seq, seq));

@@ -943,7 +1056,7 @@ static int change_clocksource(void *data)
*/
if (try_module_get(new->owner)) {
if (!new->enable || new->enable(new) == 0) {
- old = tk->tkr.clock;
+ old = tk->tkr_mono.clock;
tk_setup_internals(tk, new);
if (old->disable)
old->disable(old);
@@ -971,11 +1084,11 @@ int timekeeping_notify(struct clocksource *clock)
{
struct timekeeper *tk = &tk_core.timekeeper;

- if (tk->tkr.clock == clock)
+ if (tk->tkr_mono.clock == clock)
return 0;
stop_machine(change_clocksource, clock, NULL);
tick_clock_notify();
- return tk->tkr.clock == clock ? 0 : -1;
+ return tk->tkr_mono.clock == clock ? 0 : -1;
}

/**
@@ -993,7 +1106,7 @@ void getrawmonotonic64(struct timespec64 *ts)

do {
seq = read_seqcount_begin(&tk_core.seq);
- nsecs = timekeeping_get_ns_raw(tk);
+ nsecs = timekeeping_get_ns(&tk->tkr_raw);
ts64 = tk->raw_time;

} while (read_seqcount_retry(&tk_core.seq, seq));
@@ -1016,7 +1129,7 @@ int timekeeping_valid_for_hres(void)
do {
seq = read_seqcount_begin(&tk_core.seq);

- ret = tk->tkr.clock->flags & CLOCK_SOURCE_VALID_FOR_HRES;
+ ret = tk->tkr_mono.clock->flags & CLOCK_SOURCE_VALID_FOR_HRES;

} while (read_seqcount_retry(&tk_core.seq, seq));

@@ -1035,7 +1148,7 @@ u64 timekeeping_max_deferment(void)
do {
seq = read_seqcount_begin(&tk_core.seq);

- ret = tk->tkr.clock->max_idle_ns;
+ ret = tk->tkr_mono.clock->max_idle_ns;

} while (read_seqcount_retry(&tk_core.seq, seq));

@@ -1057,6 +1170,14 @@ void __weak read_persistent_clock(struct timespec *ts)
ts->tv_nsec = 0;
}

+void __weak read_persistent_clock64(struct timespec64 *ts64)
+{
+ struct timespec ts;
+
+ read_persistent_clock(&ts);
+ *ts64 = timespec_to_timespec64(ts);
+}
+
/**
* read_boot_clock - Return time of the system start.
*
@@ -1072,6 +1193,20 @@ void __weak read_boot_clock(struct timespec *ts)
ts->tv_nsec = 0;
}

+void __weak read_boot_clock64(struct timespec64 *ts64)
+{
+ struct timespec ts;
+
+ read_boot_clock(&ts);
+ *ts64 = timespec_to_timespec64(ts);
+}
+
+/* Flag for if timekeeping_resume() has injected sleeptime */
+static bool sleeptime_injected;
+
+/* Flag for if there is a persistent clock on this platform */
+static bool persistent_clock_exists;
+
/*
* timekeeping_init - Initializes the clocksource and common timekeeping values
*/
@@ -1081,20 +1216,17 @@ void __init timekeeping_init(void)
struct clocksource *clock;
unsigned long flags;
struct timespec64 now, boot, tmp;
- struct timespec ts;

- read_persistent_clock(&ts);
- now = timespec_to_timespec64(ts);
+ read_persistent_clock64(&now);
if (!timespec64_valid_strict(&now)) {
pr_warn("WARNING: Persistent clock returned invalid value!\n"
" Check your CMOS/BIOS settings.\n");
now.tv_sec = 0;
now.tv_nsec = 0;
} else if (now.tv_sec || now.tv_nsec)
- persistent_clock_exist = true;
+ persistent_clock_exists = true;

- read_boot_clock(&ts);
- boot = timespec_to_timespec64(ts);
+ read_boot_clock64(&boot);
if (!timespec64_valid_strict(&boot)) {
pr_warn("WARNING: Boot clock returned invalid value!\n"
" Check your CMOS/BIOS settings.\n");
@@ -1114,7 +1246,6 @@ void __init timekeeping_init(void)
tk_set_xtime(tk, &now);
tk->raw_time.tv_sec = 0;
tk->raw_time.tv_nsec = 0;
- tk->base_raw.tv64 = 0;
if (boot.tv_sec == 0 && boot.tv_nsec == 0)
boot = tk_xtime(tk);

@@ -1127,7 +1258,7 @@ void __init timekeeping_init(void)
raw_spin_unlock_irqrestore(&timekeeper_lock, flags);
}

-/* time in seconds when suspend began */
+/* time in seconds when suspend began for persistent clock */
static struct timespec64 timekeeping_suspend_time;

/**
@@ -1152,12 +1283,49 @@ static void __timekeeping_inject_sleeptime(struct timekeeper *tk,
tk_debug_account_sleep_time(delta);
}

+#if defined(CONFIG_PM_SLEEP) && defined(CONFIG_RTC_HCTOSYS_DEVICE)
+/**
+ * We have three kinds of time sources to use for sleep time
+ * injection, the preference order is:
+ * 1) non-stop clocksource
+ * 2) persistent clock (ie: RTC accessible when irqs are off)
+ * 3) RTC
+ *
+ * 1) and 2) are used by timekeeping, 3) by RTC subsystem.
+ * If system has neither 1) nor 2), 3) will be used finally.
+ *
+ *
+ * If timekeeping has injected sleeptime via either 1) or 2),
+ * 3) becomes needless, so in this case we don't need to call
+ * rtc_resume(), and this is what timekeeping_rtc_skipresume()
+ * means.
+ */
+bool timekeeping_rtc_skipresume(void)
+{
+ return sleeptime_injected;
+}
+
+/**
+ * 1) can be determined whether to use or not only when doing
+ * timekeeping_resume() which is invoked after rtc_suspend(),
+ * so we can't skip rtc_suspend() surely if system has 1).
+ *
+ * But if system has 2), 2) will definitely be used, so in this
+ * case we don't need to call rtc_suspend(), and this is what
+ * timekeeping_rtc_skipsuspend() means.
+ */
+bool timekeeping_rtc_skipsuspend(void)
+{
+ return persistent_clock_exists;
+}
+
/**
* timekeeping_inject_sleeptime64 - Adds suspend interval to timeekeeping values
* @delta: pointer to a timespec64 delta value
*
- * This hook is for architectures that cannot support read_persistent_clock
+ * This hook is for architectures that cannot support read_persistent_clock64
* because their RTC/persistent clock is only accessible when irqs are enabled.
+ * and also don't have an effective nonstop clocksource.
*
* This function should only be called by rtc_resume(), and allows
* a suspend offset to be injected into the timekeeping values.
@@ -1167,13 +1335,6 @@ void timekeeping_inject_sleeptime64(struct timespec64 *delta)
struct timekeeper *tk = &tk_core.timekeeper;
unsigned long flags;

- /*
- * Make sure we don't set the clock twice, as timekeeping_resume()
- * already did it
- */
- if (has_persistent_clock())
- return;
-
raw_spin_lock_irqsave(&timekeeper_lock, flags);
write_seqcount_begin(&tk_core.seq);

@@ -1189,26 +1350,21 @@ void timekeeping_inject_sleeptime64(struct timespec64 *delta)
/* signal hrtimers about time change */
clock_was_set();
}
+#endif

/**
* timekeeping_resume - Resumes the generic timekeeping subsystem.
- *
- * This is for the generic clocksource timekeeping.
- * xtime/wall_to_monotonic/jiffies/etc are
- * still managed by arch specific suspend/resume code.
*/
void timekeeping_resume(void)
{
struct timekeeper *tk = &tk_core.timekeeper;
- struct clocksource *clock = tk->tkr.clock;
+ struct clocksource *clock = tk->tkr_mono.clock;
unsigned long flags;
struct timespec64 ts_new, ts_delta;
- struct timespec tmp;
cycle_t cycle_now, cycle_delta;
- bool suspendtime_found = false;

- read_persistent_clock(&tmp);
- ts_new = timespec_to_timespec64(tmp);
+ sleeptime_injected = false;
+ read_persistent_clock64(&ts_new);

clockevents_resume();
clocksource_resume();
@@ -1228,16 +1384,16 @@ void timekeeping_resume(void)
* The less preferred source will only be tried if there is no better
* usable source. The rtc part is handled separately in rtc core code.
*/
- cycle_now = tk->tkr.read(clock);
+ cycle_now = tk->tkr_mono.read(clock);
if ((clock->flags & CLOCK_SOURCE_SUSPEND_NONSTOP) &&
- cycle_now > tk->tkr.cycle_last) {
+ cycle_now > tk->tkr_mono.cycle_last) {
u64 num, max = ULLONG_MAX;
u32 mult = clock->mult;
u32 shift = clock->shift;
s64 nsec = 0;

- cycle_delta = clocksource_delta(cycle_now, tk->tkr.cycle_last,
- tk->tkr.mask);
+ cycle_delta = clocksource_delta(cycle_now, tk->tkr_mono.cycle_last,
+ tk->tkr_mono.mask);

/*
* "cycle_delta * mutl" may cause 64 bits overflow, if the
@@ -1253,17 +1409,19 @@ void timekeeping_resume(void)
nsec += ((u64) cycle_delta * mult) >> shift;

ts_delta = ns_to_timespec64(nsec);
- suspendtime_found = true;
+ sleeptime_injected = true;
} else if (timespec64_compare(&ts_new, &timekeeping_suspend_time) > 0) {
ts_delta = timespec64_sub(ts_new, timekeeping_suspend_time);
- suspendtime_found = true;
+ sleeptime_injected = true;
}

- if (suspendtime_found)
+ if (sleeptime_injected)
__timekeeping_inject_sleeptime(tk, &ts_delta);

/* Re-base the last cycle value */
- tk->tkr.cycle_last = cycle_now;
+ tk->tkr_mono.cycle_last = cycle_now;
+ tk->tkr_raw.cycle_last = cycle_now;
+
tk->ntp_error = 0;
timekeeping_suspended = 0;
timekeeping_update(tk, TK_MIRROR | TK_CLOCK_WAS_SET);
@@ -1272,9 +1430,7 @@ void timekeeping_resume(void)

touch_softlockup_watchdog();

- clockevents_notify(CLOCK_EVT_NOTIFY_RESUME, NULL);
-
- /* Resume hrtimers */
+ tick_resume();
hrtimers_resume();
}

@@ -1284,10 +1440,8 @@ int timekeeping_suspend(void)
unsigned long flags;
struct timespec64 delta, delta_delta;
static struct timespec64 old_delta;
- struct timespec tmp;

- read_persistent_clock(&tmp);
- timekeeping_suspend_time = timespec_to_timespec64(tmp);
+ read_persistent_clock64(&timekeeping_suspend_time);

/*
* On some systems the persistent_clock can not be detected at
@@ -1295,31 +1449,33 @@ int timekeeping_suspend(void)
* value returned, update the persistent_clock_exists flag.
*/
if (timekeeping_suspend_time.tv_sec || timekeeping_suspend_time.tv_nsec)
- persistent_clock_exist = true;
+ persistent_clock_exists = true;

raw_spin_lock_irqsave(&timekeeper_lock, flags);
write_seqcount_begin(&tk_core.seq);
timekeeping_forward_now(tk);
timekeeping_suspended = 1;

- /*
- * To avoid drift caused by repeated suspend/resumes,
- * which each can add ~1 second drift error,
- * try to compensate so the difference in system time
- * and persistent_clock time stays close to constant.
- */
- delta = timespec64_sub(tk_xtime(tk), timekeeping_suspend_time);
- delta_delta = timespec64_sub(delta, old_delta);
- if (abs(delta_delta.tv_sec) >= 2) {
+ if (persistent_clock_exists) {
/*
- * if delta_delta is too large, assume time correction
- * has occured and set old_delta to the current delta.
+ * To avoid drift caused by repeated suspend/resumes,
+ * which each can add ~1 second drift error,
+ * try to compensate so the difference in system time
+ * and persistent_clock time stays close to constant.
*/
- old_delta = delta;
- } else {
- /* Otherwise try to adjust old_system to compensate */
- timekeeping_suspend_time =
- timespec64_add(timekeeping_suspend_time, delta_delta);
+ delta = timespec64_sub(tk_xtime(tk), timekeeping_suspend_time);
+ delta_delta = timespec64_sub(delta, old_delta);
+ if (abs(delta_delta.tv_sec) >= 2) {
+ /*
+ * if delta_delta is too large, assume time correction
+ * has occurred and set old_delta to the current delta.
+ */
+ old_delta = delta;
+ } else {
+ /* Otherwise try to adjust old_system to compensate */
+ timekeeping_suspend_time =
+ timespec64_add(timekeeping_suspend_time, delta_delta);
+ }
}

timekeeping_update(tk, TK_MIRROR);
@@ -1327,7 +1483,7 @@ int timekeeping_suspend(void)
write_seqcount_end(&tk_core.seq);
raw_spin_unlock_irqrestore(&timekeeper_lock, flags);

- clockevents_notify(CLOCK_EVT_NOTIFY_SUSPEND, NULL);
+ tick_suspend();
clocksource_suspend();
clockevents_suspend();

@@ -1416,15 +1572,15 @@ static __always_inline void timekeeping_apply_adjustment(struct timekeeper *tk,
*
* XXX - TODO: Doc ntp_error calculation.
*/
- if ((mult_adj > 0) && (tk->tkr.mult + mult_adj < mult_adj)) {
+ if ((mult_adj > 0) && (tk->tkr_mono.mult + mult_adj < mult_adj)) {
/* NTP adjustment caused clocksource mult overflow */
WARN_ON_ONCE(1);
return;
}

- tk->tkr.mult += mult_adj;
+ tk->tkr_mono.mult += mult_adj;
tk->xtime_interval += interval;
- tk->tkr.xtime_nsec -= offset;
+ tk->tkr_mono.xtime_nsec -= offset;
tk->ntp_error -= (interval - offset) << tk->ntp_error_shift;
}

@@ -1486,13 +1642,13 @@ static void timekeeping_adjust(struct timekeeper *tk, s64 offset)
tk->ntp_err_mult = 0;
}

- if (unlikely(tk->tkr.clock->maxadj &&
- (abs(tk->tkr.mult - tk->tkr.clock->mult)
- > tk->tkr.clock->maxadj))) {
+ if (unlikely(tk->tkr_mono.clock->maxadj &&
+ (abs(tk->tkr_mono.mult - tk->tkr_mono.clock->mult)
+ > tk->tkr_mono.clock->maxadj))) {
printk_once(KERN_WARNING
"Adjusting %s more than 11%% (%ld vs %ld)\n",
- tk->tkr.clock->name, (long)tk->tkr.mult,
- (long)tk->tkr.clock->mult + tk->tkr.clock->maxadj);
+ tk->tkr_mono.clock->name, (long)tk->tkr_mono.mult,
+ (long)tk->tkr_mono.clock->mult + tk->tkr_mono.clock->maxadj);
}

/*
@@ -1509,9 +1665,9 @@ static void timekeeping_adjust(struct timekeeper *tk, s64 offset)
* We'll correct this error next time through this function, when
* xtime_nsec is not as small.
*/
- if (unlikely((s64)tk->tkr.xtime_nsec < 0)) {
- s64 neg = -(s64)tk->tkr.xtime_nsec;
- tk->tkr.xtime_nsec = 0;
+ if (unlikely((s64)tk->tkr_mono.xtime_nsec < 0)) {
+ s64 neg = -(s64)tk->tkr_mono.xtime_nsec;
+ tk->tkr_mono.xtime_nsec = 0;
tk->ntp_error += neg << tk->ntp_error_shift;
}
}
@@ -1526,13 +1682,13 @@ static void timekeeping_adjust(struct timekeeper *tk, s64 offset)
*/
static inline unsigned int accumulate_nsecs_to_secs(struct timekeeper *tk)
{
- u64 nsecps = (u64)NSEC_PER_SEC << tk->tkr.shift;
+ u64 nsecps = (u64)NSEC_PER_SEC << tk->tkr_mono.shift;
unsigned int clock_set = 0;

- while (tk->tkr.xtime_nsec >= nsecps) {
+ while (tk->tkr_mono.xtime_nsec >= nsecps) {
int leap;

- tk->tkr.xtime_nsec -= nsecps;
+ tk->tkr_mono.xtime_nsec -= nsecps;
tk->xtime_sec++;

/* Figure out if its a leap sec and apply if needed */
@@ -1577,9 +1733,10 @@ static cycle_t logarithmic_accumulation(struct timekeeper *tk, cycle_t offset,

/* Accumulate one shifted interval */
offset -= interval;
- tk->tkr.cycle_last += interval;
+ tk->tkr_mono.cycle_last += interval;
+ tk->tkr_raw.cycle_last += interval;

- tk->tkr.xtime_nsec += tk->xtime_interval << shift;
+ tk->tkr_mono.xtime_nsec += tk->xtime_interval << shift;
*clock_set |= accumulate_nsecs_to_secs(tk);

/* Accumulate raw time */
@@ -1622,14 +1779,17 @@ void update_wall_time(void)
#ifdef CONFIG_ARCH_USES_GETTIMEOFFSET
offset = real_tk->cycle_interval;
#else
- offset = clocksource_delta(tk->tkr.read(tk->tkr.clock),
- tk->tkr.cycle_last, tk->tkr.mask);
+ offset = clocksource_delta(tk->tkr_mono.read(tk->tkr_mono.clock),
+ tk->tkr_mono.cycle_last, tk->tkr_mono.mask);
#endif

/* Check if there's really nothing to do */
if (offset < real_tk->cycle_interval)
goto out;

+ /* Do some additional sanity checking */
+ timekeeping_check_update(real_tk, offset);
+
/*
* With NO_HZ we may have to accumulate many cycle_intervals
* (think "ticks") worth of time at once. To do this efficiently,
@@ -1784,8 +1944,8 @@ ktime_t ktime_get_update_offsets_tick(ktime_t *offs_real, ktime_t *offs_boot,
do {
seq = read_seqcount_begin(&tk_core.seq);

- base = tk->tkr.base_mono;
- nsecs = tk->tkr.xtime_nsec >> tk->tkr.shift;
+ base = tk->tkr_mono.base;
+ nsecs = tk->tkr_mono.xtime_nsec >> tk->tkr_mono.shift;

*offs_real = tk->offs_real;
*offs_boot = tk->offs_boot;
@@ -1816,8 +1976,8 @@ ktime_t ktime_get_update_offsets_now(ktime_t *offs_real, ktime_t *offs_boot,
do {
seq = read_seqcount_begin(&tk_core.seq);

- base = tk->tkr.base_mono;
- nsecs = timekeeping_get_ns(&tk->tkr);
+ base = tk->tkr_mono.base;
+ nsecs = timekeeping_get_ns(&tk->tkr_mono);

*offs_real = tk->offs_real;
*offs_boot = tk->offs_boot;
diff --git a/kernel/time/timekeeping.h b/kernel/time/timekeeping.h
index 1d91416055d5..ead8794b9a4e 100644
--- a/kernel/time/timekeeping.h
+++ b/kernel/time/timekeeping.h
@@ -19,4 +19,11 @@ extern void timekeeping_clocktai(struct timespec *ts);
extern int timekeeping_suspend(void);
extern void timekeeping_resume(void);

+extern void do_timer(unsigned long ticks);
+extern void update_wall_time(void);
+
+extern seqlock_t jiffies_lock;
+
+#define CS_NAME_LEN 32
+
#endif
diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 2d3f5c504939..2ece3aa5069c 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -90,8 +90,18 @@ struct tvec_base {
struct tvec tv5;
} ____cacheline_aligned;

+/*
+ * __TIMER_INITIALIZER() needs to set ->base to a valid pointer (because we've
+ * made NULL special, hint: lock_timer_base()) and we cannot get a compile time
+ * pointer to per-cpu entries because we don't know where we'll map the section,
+ * even for the boot cpu.
+ *
+ * And so we use boot_tvec_bases for boot CPU and per-cpu __tvec_bases for the
+ * rest of them.
+ */
struct tvec_base boot_tvec_bases;
EXPORT_SYMBOL(boot_tvec_bases);
+
static DEFINE_PER_CPU(struct tvec_base *, tvec_bases) = &boot_tvec_bases;

/* Functions below help us manage 'deferrable' flag */
@@ -1027,6 +1037,8 @@ int try_to_del_timer_sync(struct timer_list *timer)
EXPORT_SYMBOL(try_to_del_timer_sync);

#ifdef CONFIG_SMP
+static DEFINE_PER_CPU(struct tvec_base, __tvec_bases);
+
/**
* del_timer_sync - deactivate a timer and wait for the handler to finish.
* @timer: the timer to be deactivated
@@ -1532,64 +1544,6 @@ signed long __sched schedule_timeout_uninterruptible(signed long timeout)
}
EXPORT_SYMBOL(schedule_timeout_uninterruptible);

-static int init_timers_cpu(int cpu)
-{
- int j;
- struct tvec_base *base;
- static char tvec_base_done[NR_CPUS];
-
- if (!tvec_base_done[cpu]) {
- static char boot_done;
-
- if (boot_done) {
- /*
- * The APs use this path later in boot
- */
- base = kzalloc_node(sizeof(*base), GFP_KERNEL,
- cpu_to_node(cpu));
- if (!base)
- return -ENOMEM;
-
- /* Make sure tvec_base has TIMER_FLAG_MASK bits free */
- if (WARN_ON(base != tbase_get_base(base))) {
- kfree(base);
- return -ENOMEM;
- }
- per_cpu(tvec_bases, cpu) = base;
- } else {
- /*
- * This is for the boot CPU - we use compile-time
- * static initialisation because per-cpu memory isn't
- * ready yet and because the memory allocators are not
- * initialised either.
- */
- boot_done = 1;
- base = &boot_tvec_bases;
- }
- spin_lock_init(&base->lock);
- tvec_base_done[cpu] = 1;
- base->cpu = cpu;
- } else {
- base = per_cpu(tvec_bases, cpu);
- }
-
-
- for (j = 0; j < TVN_SIZE; j++) {
- INIT_LIST_HEAD(base->tv5.vec + j);
- INIT_LIST_HEAD(base->tv4.vec + j);
- INIT_LIST_HEAD(base->tv3.vec + j);
- INIT_LIST_HEAD(base->tv2.vec + j);
- }
- for (j = 0; j < TVR_SIZE; j++)
- INIT_LIST_HEAD(base->tv1.vec + j);
-
- base->timer_jiffies = jiffies;
- base->next_timer = base->timer_jiffies;
- base->active_timers = 0;
- base->all_timers = 0;
- return 0;
-}
-
#ifdef CONFIG_HOTPLUG_CPU
static void migrate_timer_list(struct tvec_base *new_base, struct list_head *head)
{
@@ -1631,55 +1585,86 @@ static void migrate_timers(int cpu)
migrate_timer_list(new_base, old_base->tv5.vec + i);
}

+ old_base->active_timers = 0;
+ old_base->all_timers = 0;
+
spin_unlock(&old_base->lock);
spin_unlock_irq(&new_base->lock);
put_cpu_var(tvec_bases);
}
-#endif /* CONFIG_HOTPLUG_CPU */

static int timer_cpu_notify(struct notifier_block *self,
unsigned long action, void *hcpu)
{
- long cpu = (long)hcpu;
- int err;
-
- switch(action) {
- case CPU_UP_PREPARE:
- case CPU_UP_PREPARE_FROZEN:
- err = init_timers_cpu(cpu);
- if (err < 0)
- return notifier_from_errno(err);
- break;
-#ifdef CONFIG_HOTPLUG_CPU
+ switch (action) {
case CPU_DEAD:
case CPU_DEAD_FROZEN:
- migrate_timers(cpu);
+ migrate_timers((long)hcpu);
break;
-#endif
default:
break;
}
+
return NOTIFY_OK;
}

-static struct notifier_block timers_nb = {
- .notifier_call = timer_cpu_notify,
-};
+static inline void timer_register_cpu_notifier(void)
+{
+ cpu_notifier(timer_cpu_notify, 0);
+}
+#else
+static inline void timer_register_cpu_notifier(void) { }
+#endif /* CONFIG_HOTPLUG_CPU */

+static void __init init_timer_cpu(struct tvec_base *base, int cpu)
+{
+ int j;

-void __init init_timers(void)
+ BUG_ON(base != tbase_get_base(base));
+
+ base->cpu = cpu;
+ per_cpu(tvec_bases, cpu) = base;
+ spin_lock_init(&base->lock);
+
+ for (j = 0; j < TVN_SIZE; j++) {
+ INIT_LIST_HEAD(base->tv5.vec + j);
+ INIT_LIST_HEAD(base->tv4.vec + j);
+ INIT_LIST_HEAD(base->tv3.vec + j);
+ INIT_LIST_HEAD(base->tv2.vec + j);
+ }
+ for (j = 0; j < TVR_SIZE; j++)
+ INIT_LIST_HEAD(base->tv1.vec + j);
+
+ base->timer_jiffies = jiffies;
+ base->next_timer = base->timer_jiffies;
+}
+
+static void __init init_timer_cpus(void)
{
- int err;
+ struct tvec_base *base;
+ int local_cpu = smp_processor_id();
+ int cpu;

+ for_each_possible_cpu(cpu) {
+ if (cpu == local_cpu)
+ base = &boot_tvec_bases;
+#ifdef CONFIG_SMP
+ else
+ base = per_cpu_ptr(&__tvec_bases, cpu);
+#endif
+
+ init_timer_cpu(base, cpu);
+ }
+}
+
+void __init init_timers(void)
+{
/* ensure there are enough low bits for flags in timer->base pointer */
BUILD_BUG_ON(__alignof__(struct tvec_base) & TIMER_FLAG_MASK);

- err = timer_cpu_notify(&timers_nb, (unsigned long)CPU_UP_PREPARE,
- (void *)(long)smp_processor_id());
- BUG_ON(err != NOTIFY_OK);
-
+ init_timer_cpus();
init_timer_stats();
- register_cpu_notifier(&timers_nb);
+ timer_register_cpu_notifier();
open_softirq(TIMER_SOFTIRQ, run_timer_softirq);
}

diff --git a/kernel/time/timer_list.c b/kernel/time/timer_list.c
index 61ed862cdd37..e878c2e0ba45 100644
--- a/kernel/time/timer_list.c
+++ b/kernel/time/timer_list.c
@@ -16,10 +16,10 @@
#include <linux/sched.h>
#include <linux/seq_file.h>
#include <linux/kallsyms.h>
-#include <linux/tick.h>

#include <asm/uaccess.h>

+#include "tick-internal.h"

struct timer_list_iter {
int cpu;
@@ -228,9 +228,35 @@ print_tickdevice(struct seq_file *m, struct tick_device *td, int cpu)
print_name_offset(m, dev->set_next_event);
SEQ_printf(m, "\n");

- SEQ_printf(m, " set_mode: ");
- print_name_offset(m, dev->set_mode);
- SEQ_printf(m, "\n");
+ if (dev->set_mode) {
+ SEQ_printf(m, " set_mode: ");
+ print_name_offset(m, dev->set_mode);
+ SEQ_printf(m, "\n");
+ } else {
+ if (dev->set_state_shutdown) {
+ SEQ_printf(m, " shutdown: ");
+ print_name_offset(m, dev->set_state_shutdown);
+ SEQ_printf(m, "\n");
+ }
+
+ if (dev->set_state_periodic) {
+ SEQ_printf(m, " periodic: ");
+ print_name_offset(m, dev->set_state_periodic);
+ SEQ_printf(m, "\n");
+ }
+
+ if (dev->set_state_oneshot) {
+ SEQ_printf(m, " oneshot: ");
+ print_name_offset(m, dev->set_state_oneshot);
+ SEQ_printf(m, "\n");
+ }
+
+ if (dev->tick_resume) {
+ SEQ_printf(m, " resume: ");
+ print_name_offset(m, dev->tick_resume);
+ SEQ_printf(m, "\n");
+ }
+ }

SEQ_printf(m, " event_handler: ");
print_name_offset(m, dev->event_handler);
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index c5cefb3c009c..36b6fa88ce5b 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -865,6 +865,19 @@ config SCHED_STACK_END_CHECK
data corruption or a sporadic crash at a later stage once the region
is examined. The runtime overhead introduced is minimal.

+config DEBUG_TIMEKEEPING
+ bool "Enable extra timekeeping sanity checking"
+ help
+ This option will enable additional timekeeping sanity checks
+ which may be helpful when diagnosing issues where timekeeping
+ problems are suspected.
+
+ This may include checks in the timekeeping hotpaths, so this
+ option may have a (very small) performance impact to some
+ workloads.
+
+ If unsure, say N.
+
config TIMER_STATS
bool "Collect kernel timers statistics"
depends on DEBUG_KERNEL && PROC_FS
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/