Re: [RFC PATCH v2 12/14] x86/watchdog/hardlockup/hpet: Determine if HPET timer caused NMI

From: Ricardo Neri
Date: Tue Apr 09 2019 - 21:21:10 EST


On Tue, Apr 09, 2019 at 01:28:17PM +0200, Peter Zijlstra wrote:
> On Wed, Feb 27, 2019 at 08:05:16AM -0800, Ricardo Neri wrote:
> > @@ -62,7 +67,18 @@ static inline void set_comparator(struct hpet_hld_data *hdata,
> > static void kick_timer(struct hpet_hld_data *hdata, bool force)
> > {
> > bool kick_needed = force || !(hdata->flags & HPET_DEV_PERI_CAP);
> > - unsigned long new_compare, count;
> > + unsigned long tsc_curr, tsc_delta, new_compare, count;
> > +
> > + /* Start obtaining the current TSC and HPET counts. */
> > + tsc_curr = rdtsc();
> > +
> > + if (kick_needed)
> > + count = get_count();
> > +
> > + tsc_delta = (unsigned long)watchdog_thresh * (unsigned long)tsc_khz
> > + * 1000L;
> > + hdata->tsc_next = tsc_curr + tsc_delta;
> > + hdata->tsc_next_error = tsc_delta >> 6;
>
> What do we need a per hld_data tsc_next_error for? It is basically a
> global 'constant'.
>

This is true. I thought I'd keep all the needed variables in a single
struct to make the code more readable. I guess, I did not achieve that
goal. I'll put it as a static global variable.

> > /*
> > * Update the comparator in increments of watch_thresh seconds relative
> > @@ -74,8 +90,6 @@ static void kick_timer(struct hpet_hld_data *hdata, bool force)
> > */
> >
> > if (kick_needed) {
> > - count = get_count();
> > -
> > new_compare = count + watchdog_thresh * hdata->ticks_per_second;
> >
> > set_comparator(hdata, new_compare);
> > @@ -147,6 +161,14 @@ static void set_periodic(struct hpet_hld_data *hdata)
> > */
> > static bool is_hpet_wdt_interrupt(struct hpet_hld_data *hdata)
> > {
> > + if (smp_processor_id() == hdata->handling_cpu) {
> > + unsigned long tsc_curr;
>
> TSC is u64

In x86_64, isn't u64 an unsigned long? Do you mean to consider the
32-bit case?

>
> > +
> > + tsc_curr = rdtsc();
> > + if (abs(tsc_curr - hdata->tsc_next) < hdata->tsc_next_error)
> > + return true;
>
> You can write that as:
>
> (tsc_curr - hdata->tsc_next) + tsc_error < 2*tsc_error
>
> which doesn't contain any branches what so ever.
>

Sure, I'll add this change.

Thanks and BR,
Ricardo