Re: [PATCH v2] clocksource/arm_arch_timer: Fix masking for high freq counters

From: Marc Zyngier
Date: Mon Aug 09 2021 - 07:07:57 EST


On Sat, 07 Aug 2021 20:14:28 +0100,
Oliver Upton <oupton@xxxxxxxxxx> wrote:
>
> Unfortunately, the architecture provides no means to determine the bit
> width of the system counter. However, we do know the following from the
> specification:
>
> - the system counter is at least 56 bits wide
> - Roll-over time of not less than 40 years
>
> To date, the arch timer driver has depended on the first property,
> assuming any system counter to be 56 bits wide and masking off the rest.
> However, combining a narrow clocksource mask with a high frequency
> counter could result in prematurely wrapping the system counter by a
> significant margin. For example, a 56 bit wide, 1GHz system counter
> would wrap in a mere 2.28 years!
>
> This is a problem for two reasons: v8.6+ implementations are required to
> provide a 64 bit, 1GHz system counter. Furthermore, before v8.6,
> implementers may select a counter frequency of their choosing.
>
> Fix the issue by deriving a valid clock mask based on the second
> property from above. Set the floor at 56 bits, since we know no system
> counter is narrower than that.
>
> Suggested-by: Marc Zyngier <maz@xxxxxxxxxx>
> Signed-off-by: Oliver Upton <oupton@xxxxxxxxxx>
> ---
> This patch was tested on QEMU, tweaked to provide a 1GHz system counter
> frequency. The 'bp.refcounter.base_frequency' property does not seem to
> have any effect on the 'ARMvA Base RevC AEM FVP', which instead provides
> a 100MHz counter.
>
> Parent commit: 0c32706dac1b ("arm64: stacktrace: avoid tracing arch_stack_walk()")
>
> drivers/clocksource/arm_arch_timer.c | 32 ++++++++++++++++++++++++----
> 1 file changed, 28 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
> index be6d741d404c..f4816b22213c 100644
> --- a/drivers/clocksource/arm_arch_timer.c
> +++ b/drivers/clocksource/arm_arch_timer.c
> @@ -52,6 +52,12 @@
> #define CNTV_TVAL 0x38
> #define CNTV_CTL 0x3c
>
> +/*
> + * The minimum amount of time a generic counter is guaranteed to not roll over
> + * (40 years)
> + */
> +#define MIN_ROLLOVER_SECS (40ULL * 365 * 24 * 3600)
> +
> static unsigned arch_timers_present __initdata;
>
> static void __iomem *arch_counter_base __ro_after_init;
> @@ -205,13 +211,11 @@ static struct clocksource clocksource_counter = {
> .id = CSID_ARM_ARCH_COUNTER,
> .rating = 400,
> .read = arch_counter_read,
> - .mask = CLOCKSOURCE_MASK(56),
> .flags = CLOCK_SOURCE_IS_CONTINUOUS,
> };
>
> static struct cyclecounter cyclecounter __ro_after_init = {
> .read = arch_counter_read_cc,
> - .mask = CLOCKSOURCE_MASK(56),
> };
>
> struct ate_acpi_oem_info {
> @@ -1004,9 +1008,26 @@ struct arch_timer_kvm_info *arch_timer_get_kvm_info(void)
> return &arch_timer_kvm_info;
> }
>
> +/*
> + * Makes an educated guess at a valid counter width based on the Generic Timer
> + * specification. Of note:
> + * 1) the system counter is at least 56 bits wide
> + * 2) a roll-over time of not less than 40 years
> + *
> + * See 'ARM DDI 0487G.a D11.1.2 ("The system counter")' for more details.
> + */
> +static int __init arch_counter_get_width(void)
> +{
> + u64 min_cycles = MIN_ROLLOVER_SECS * arch_timer_rate;
> +
> + /* guarantee the returned width is within the valid range */
> + return clamp_val(ilog2(min_cycles), 56, 64);

See my comment elsewhere in the thread about the potentially wasted
bit.
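
A rough userspace sketch of the arithmetic, assuming the remark above
refers to ilog2() rounding down. The helper names (ilog2_u64,
clamp_width, width_round_down, width_round_up, wrap_secs) are made up
for illustration and are not part of the patch; only MIN_ROLLOVER_SECS
and the 56..64 clamp come from the quoted code.

```c
#include <stdint.h>

/* 40 years in seconds, as defined in the quoted patch */
#define MIN_ROLLOVER_SECS (40ULL * 365 * 24 * 3600)

/* floor(log2(v)) for v > 0; userspace stand-in for the kernel's ilog2() */
static int ilog2_u64(uint64_t v)
{
	int n = -1;

	while (v) {
		v >>= 1;
		n++;
	}
	return n;
}

/* keep the width within the range the patch allows (56..64 bits) */
static int clamp_width(int w)
{
	if (w < 56)
		return 56;
	if (w > 64)
		return 64;
	return w;
}

/* width as computed in the quoted patch: log2 rounded down */
static int width_round_down(uint64_t rate_hz)
{
	uint64_t min_cycles = MIN_ROLLOVER_SECS * rate_hz;

	return clamp_width(ilog2_u64(min_cycles));
}

/* hypothetical alternative: log2 rounded up, so 2^width >= min_cycles */
static int width_round_up(uint64_t rate_hz)
{
	uint64_t min_cycles = MIN_ROLLOVER_SECS * rate_hz;

	return clamp_width(ilog2_u64(min_cycles - 1) + 1);
}

/* whole seconds until a counter of the given width wraps at rate_hz */
static uint64_t wrap_secs(int width, uint64_t rate_hz)
{
	uint64_t span = width >= 64 ? UINT64_MAX : (1ULL << width);

	return span / rate_hz;
}
```

With a 1GHz counter, rounding down gives 60 bits, which wraps after
roughly 36.5 years, short of the 40 years the specification guarantees.
Rounding up gives 61 bits. A 56 bit mask at the same rate wraps after
about 2.28 years, matching the figure in the commit message.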

> +}
> +
> static void __init arch_counter_register(unsigned type)
> {
> u64 start_count;
> + int width;
>
> /* Register the CP15 based counter if we have one */
> if (type & ARCH_TIMER_TYPE_CP15) {
> @@ -1031,6 +1052,10 @@ static void __init arch_counter_register(unsigned type)
> arch_timer_read_counter = arch_counter_get_cntvct_mem;
> }
>
> + width = arch_counter_get_width();
> + clocksource_counter.mask = CLOCKSOURCE_MASK(width);
> + cyclecounter.mask = CLOCKSOURCE_MASK(width);
> +
> if (!arch_counter_suspend_stop)
> clocksource_counter.flags |= CLOCK_SOURCE_SUSPEND_NONSTOP;
> start_count = arch_timer_read_counter();
> @@ -1040,8 +1065,7 @@ static void __init arch_counter_register(unsigned type)
> timecounter_init(&arch_timer_kvm_info.timecounter,
> &cyclecounter, start_count);
>
> - /* 56 bits minimum, so we assume worst case rollover */
> - sched_clock_register(arch_timer_read_counter, 56, arch_timer_rate);
> + sched_clock_register(arch_timer_read_counter, width, arch_timer_rate);

For the record, there is one spot where the clockevent gets registered
and configured that also needs addressing (there is a mask hardcoded to
31 bits there, which is pretty odd). It cannot be fixed directly in
this patch though. I'll probably take this patch on top of my series
and adjust the relevant bits.
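
To give a feel for why a 31 bit limit looks odd at these frequencies,
here is a small sketch, assuming the mask bounds the largest delta that
can be programmed into the timer. max_delta_ms is a made-up helper, not
a kernel function.

```c
#include <stdint.h>

/*
 * Longest programmable delta, in whole milliseconds, for a timer whose
 * delta is limited to 'bits' bits and which ticks at rate_hz.
 * Only meant for small widths (bits < 54) so the multiply cannot overflow.
 */
static uint64_t max_delta_ms(unsigned int bits, uint64_t rate_hz)
{
	uint64_t max_ticks = (1ULL << bits) - 1;

	return max_ticks * 1000 / rate_hz;
}
```

Under that assumption, 31 bits at 1GHz allows a delta of a little over
two seconds, and about 21 seconds at 100MHz.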

M.

--
Without deviation from the norm, progress is not possible.