Re: [PATCH 11/13] clocksource/arm_arch_timer: Fix masking for high freq counters

From: Oliver Upton
Date: Tue Aug 10 2021 - 05:10:02 EST


On Tue, Aug 10, 2021 at 1:40 AM Marc Zyngier <maz@xxxxxxxxxx> wrote:
>
> On Mon, 09 Aug 2021 17:45:28 +0100,
> Oliver Upton <oupton@xxxxxxxxxx> wrote:
> >
> > On Mon, Aug 9, 2021 at 8:48 AM Marc Zyngier <maz@xxxxxxxxxx> wrote:
> > >
> > > From: Oliver Upton <oupton@xxxxxxxxxx>
> > >
> > > Unfortunately, the architecture provides no means to determine the bit
> > > width of the system counter. However, we do know the following from the
> > > specification:
> > >
> > > - the system counter is at least 56 bits wide
> > > - Roll-over time of not less than 40 years
> > >
> > > To date, the arch timer driver has depended on the first property,
> > > assuming any system counter to be 56 bits wide and masking off the rest.
> > > However, combining a narrow clocksource mask with a high frequency
> > > counter could result in prematurely wrapping the system counter by a
> > > significant margin. For example, a 56 bit wide, 1GHz system counter
> > > would wrap in a mere 2.28 years!
> > >
> > > This is a problem for two reasons: v8.6+ implementations are required to
> > > provide a 64 bit, 1GHz system counter. Furthermore, before v8.6,
> > > implementers may select a counter frequency of their choosing.
> > >
> > > Fix the issue by deriving a valid clock mask based on the second
> > > property from above. Set the floor at 56 bits, since we know no system
> > > counter is narrower than that.
> > >
> > > Suggested-by: Marc Zyngier <maz@xxxxxxxxxx>
> > > Signed-off-by: Oliver Upton <oupton@xxxxxxxxxx>
> > > Reviewed-by: Linus Walleij <linus.walleij@xxxxxxxxxx>
> > > [maz: fixed width computation not to lose the last bit, added
> > > max delta generation for the timer]
> > > Signed-off-by: Marc Zyngier <maz@xxxxxxxxxx>
> > > Link: https://lore.kernel.org/r/20210807191428.3488948-1-oupton@xxxxxxxxxx
> > > ---
> > > drivers/clocksource/arm_arch_timer.c | 34 ++++++++++++++++++++++++----
> > > 1 file changed, 29 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
> > > index fa09952b94bf..74eca831d0d9 100644
> > > --- a/drivers/clocksource/arm_arch_timer.c
> > > +++ b/drivers/clocksource/arm_arch_timer.c
> > > @@ -52,6 +52,12 @@
> > > #define CNTV_CVAL_LO 0x30
> > > #define CNTV_CTL 0x3c
> > >
> > > +/*
> > > + * The minimum amount of time a generic counter is guaranteed to not roll over
> > > + * (40 years)
> > > + */
> > > +#define MIN_ROLLOVER_SECS (40ULL * 365 * 24 * 3600)
> > > +
> > > static unsigned arch_timers_present __initdata;
> > >
> > > struct arch_timer {
> > > @@ -95,6 +101,22 @@ static int __init early_evtstrm_cfg(char *buf)
> > > }
> > > early_param("clocksource.arm_arch_timer.evtstrm", early_evtstrm_cfg);
> > >
> > > +/*
> > > + * Makes an educated guess at a valid counter width based on the Generic Timer
> > > + * specification. Of note:
> > > + * 1) the system counter is at least 56 bits wide
> > > + * 2) a roll-over time of not less than 40 years
> > > + *
> > > + * See 'ARM DDI 0487G.a D11.1.2 ("The system counter")' for more details.
> > > + */
> > > +static int arch_counter_get_width(void)
> > > +{
> > > + u64 min_cycles = MIN_ROLLOVER_SECS * arch_timer_rate;
> > > +
> > > + /* guarantee the returned width is within the valid range */
> > > + return clamp_val(ilog2(min_cycles - 1) + 1, 56, 64);
> > > +}
> >
> > Reposting thoughts from the original patch:
> >
> > Reading the ARM ARM
> > D11.1.2 'The system counter', I did not find any language that
> > suggested the counter saturates the register width before rolling
> > over. So, it may be paranoid, but I presumed it to be safer to wrap
> > within the guaranteed interval rather (40 years) than assume the
> > sanity of the system counter implementation.
>
> I really don't think that would be a likely implementation. The fact
> that the ARM ARM only talks about the width of the counter makes it a
> strong case that there is no 'ceiling' other than the natural
> saturation of the counter, IMO. If a rollover was allowed to occur
> before, it would definitely be mentioned.
>
> Think about it: you'd need to implement an extra comparator to drive
> the reset of the counter. It would also make the implementation of
> CVAL stupidly complicated: how do you handle the set of values that
> fit in the counter width, but are out of the counter range?
>
> Even though the architecture is not the clearest thing, I'm expecting
> the CPU designers to try and save gates, rather than trying to
> implement a GOTCHA, expensive counter... ;-)

Fair, I'll put the tinfoil away then :) Just wanted to avoid reading
between the lines, but it would be rather stunning to encounter
hardware in the wild that does this. Your additions to the patch LGTM.

--
Thanks,
Oliver