Re: [PATCH 11/13] clocksource/arm_arch_timer: Fix masking for high freq counters

From: Marc Zyngier
Date: Tue Aug 10 2021 - 04:40:42 EST


On Mon, 09 Aug 2021 17:45:28 +0100,
Oliver Upton <oupton@xxxxxxxxxx> wrote:
>
> On Mon, Aug 9, 2021 at 8:48 AM Marc Zyngier <maz@xxxxxxxxxx> wrote:
> >
> > From: Oliver Upton <oupton@xxxxxxxxxx>
> >
> > Unfortunately, the architecture provides no means to determine the bit
> > width of the system counter. However, we do know the following from the
> > specification:
> >
> > - the system counter is at least 56 bits wide
> > - Roll-over time of not less than 40 years
> >
> > To date, the arch timer driver has depended on the first property,
> > assuming any system counter to be 56 bits wide and masking off the rest.
> > However, combining a narrow clocksource mask with a high frequency
> > counter could result in prematurely wrapping the system counter by a
> > significant margin. For example, a 56 bit wide, 1GHz system counter
> > would wrap in a mere 2.28 years!
> >
> > This is a problem for two reasons: v8.6+ implementations are required to
> > provide a 64 bit, 1GHz system counter. Furthermore, before v8.6,
> > implementers may select a counter frequency of their choosing.
> >
> > Fix the issue by deriving a valid clock mask based on the second
> > property from above. Set the floor at 56 bits, since we know no system
> > counter is narrower than that.
> >
> > Suggested-by: Marc Zyngier <maz@xxxxxxxxxx>
> > Signed-off-by: Oliver Upton <oupton@xxxxxxxxxx>
> > Reviewed-by: Linus Walleij <linus.walleij@xxxxxxxxxx>
> > [maz: fixed width computation not to lose the last bit, added
> > max delta generation for the timer]
> > Signed-off-by: Marc Zyngier <maz@xxxxxxxxxx>
> > Link: https://lore.kernel.org/r/20210807191428.3488948-1-oupton@xxxxxxxxxx
> > ---
> > drivers/clocksource/arm_arch_timer.c | 34 ++++++++++++++++++++++++----
> > 1 file changed, 29 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
> > index fa09952b94bf..74eca831d0d9 100644
> > --- a/drivers/clocksource/arm_arch_timer.c
> > +++ b/drivers/clocksource/arm_arch_timer.c
> > @@ -52,6 +52,12 @@
> > #define CNTV_CVAL_LO 0x30
> > #define CNTV_CTL 0x3c
> >
> > +/*
> > + * The minimum amount of time a generic counter is guaranteed to not roll over
> > + * (40 years)
> > + */
> > +#define MIN_ROLLOVER_SECS (40ULL * 365 * 24 * 3600)
> > +
> > static unsigned arch_timers_present __initdata;
> >
> > struct arch_timer {
> > @@ -95,6 +101,22 @@ static int __init early_evtstrm_cfg(char *buf)
> > }
> > early_param("clocksource.arm_arch_timer.evtstrm", early_evtstrm_cfg);
> >
> > +/*
> > + * Makes an educated guess at a valid counter width based on the Generic Timer
> > + * specification. Of note:
> > + * 1) the system counter is at least 56 bits wide
> > + * 2) a roll-over time of not less than 40 years
> > + *
> > + * See 'ARM DDI 0487G.a D11.1.2 ("The system counter")' for more details.
> > + */
> > +static int arch_counter_get_width(void)
> > +{
> > + u64 min_cycles = MIN_ROLLOVER_SECS * arch_timer_rate;
> > +
> > + /* guarantee the returned width is within the valid range */
> > + return clamp_val(ilog2(min_cycles - 1) + 1, 56, 64);
> > +}
>
> Reposting thoughts from the original patch:
>
> Reading the ARM ARM
> D11.1.2 'The system counter', I did not find any language that
> suggested the counter saturates the register width before rolling
> over. So, it may be paranoid, but I presumed it to be safer to wrap
> within the guaranteed interval rather (40 years) than assume the
> sanity of the system counter implementation.

I really don't think that would be a likely implementation. The fact
that the ARM ARM only talks about the width of the counter makes it a
strong case that there is no 'ceiling' other than the natural
saturation of the counter, IMO. If a rollover was allowed to occur
before, it would definitely be mentioned.

Think about it: you'd need to implement an extra comparator to drive
the reset of the counter. It would also make the implementation of
CVAL stupidly complicated: how do you handle the set of values that
fit in the counter width, but are out of the counter range?

Even though the architecture is not the clearest thing, I'm expecting
the CPU designers to try and save gates, rather than trying to
implement a GOTCHA, expensive counter... ;-)

Thanks,

M.

--
Without deviation from the norm, progress is not possible.