Re: [RESEND PATCH v4] x86/hpet: Reduce HPET counter read contention

From: Waiman Long
Date: Thu Aug 11 2016 - 19:22:15 EST


On 08/11/2016 03:32 PM, Dave Hansen wrote:
> On 08/10/2016 11:29 AM, Waiman Long wrote:
>> +static cycle_t read_hpet(struct clocksource *cs)
>> +{
>> +	int seq;
>> +
>> +	seq = READ_ONCE(hpet_save.seq);
>> +	if (!HPET_SEQ_LOCKED(seq)) {
> ...
>> +	}
>> +
>> +	/*
>> +	 * Wait until the locked sequence number changes which indicates
>> +	 * that the saved HPET value is up-to-date.
>> +	 */
>> +	while (READ_ONCE(hpet_save.seq) == seq) {
>> +		/*
>> +		 * Since reading the HPET is much slower than a single
>> +		 * cpu_relax() instruction, we use two here in an attempt
>> +		 * to reduce the amount of cacheline contention in the
>> +		 * hpet_save.seq cacheline.
>> +		 */
>> +		cpu_relax();
>> +		cpu_relax();
>> +	}
>> +
>> +	return (cycle_t)READ_ONCE(hpet_save.hpet);
>> +}
> It's a real bummer that this all has to be open-coded. I have to wonder
> if there were any alternatives that you tried that were simpler.

What do you mean by "open-coded"? Do you mean the function can be inlined?


> Is READ_ONCE()/smp_store_release() really strong enough here? It
> guarantees ordering, but you need ordering *and* a guarantee that your
> write is visible to the reader. Don't you need actual barriers for
> that? Otherwise, you might be seeing a stale HPET value, and the spin
> loop that you did waiting for it to be up-to-date was worthless. The
> seqlock code uses barriers, btw.

The cmpxchg() and smp_store_release() calls act as the lock/unlock sequence and supply the proper barriers. The other important point is that the HPET value must become visible to the other readers before the new sequence number does, and that ordering is what smp_store_release() provides. cmpxchg() is an actual barrier, whereas smp_store_release() is not. However, I think the x86 architecture guarantees that the writes become visible in order.
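
To illustrate the lock/unlock pairing described above, here is a minimal userspace sketch using C11 atomics. It is not the patch code, and it does not try to fill in the part of read_hpet() elided in the quote. The names read_counter_shared(), read_hw_counter() and fake_counter are hypothetical stand-ins; atomic_compare_exchange_strong() plays the role of cmpxchg(), and a memory_order_release store plays the role of smp_store_release().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* An odd sequence number means one thread is currently reading the counter. */
#define SEQ_LOCKED(seq)	((seq) & 1)

/* Hypothetical stand-in for the slow hardware read (not the real HPET). */
static _Atomic uint32_t fake_counter;

static uint32_t read_hw_counter(void)
{
	return atomic_fetch_add(&fake_counter, 1) + 1;
}

/* Illustrative analog of hpet_save: last value read plus a sequence number. */
static struct {
	_Atomic uint32_t hpet;
	_Atomic unsigned int seq;
} hpet_save;

uint32_t read_counter_shared(void)
{
	for (;;) {
		unsigned int seq = atomic_load_explicit(&hpet_save.seq,
							memory_order_acquire);

		if (!SEQ_LOCKED(seq)) {
			unsigned int expected = seq;

			/* "Lock": a sequentially consistent compare-and-swap. */
			if (!atomic_compare_exchange_strong(&hpet_save.seq,
							    &expected, seq + 1))
				continue;	/* lost the race, look at seq again */

			uint32_t val = read_hw_counter();

			atomic_store_explicit(&hpet_save.hpet, val,
					      memory_order_relaxed);
			/* "Unlock": release orders the value store before this one. */
			atomic_store_explicit(&hpet_save.seq, seq + 2,
					      memory_order_release);
			return val;
		}

		/* Spin until the lock holder publishes a new sequence number. */
		while (atomic_load_explicit(&hpet_save.seq,
					    memory_order_acquire) == seq)
			;

		/* The acquire load above pairs with the release store. */
		return atomic_load_explicit(&hpet_save.hpet, memory_order_relaxed);
	}
}
```

In this sketch a waiter that observes the changed sequence number is guaranteed to see the value stored before it, because the acquire load pairs with the release store. On x86 with ordinary write-back memory, a release store typically needs no fence instruction, since the architecture does not reorder a store with earlier stores.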

> Also, since you're fundamentally reading a second-hand HPET value, does
> that have any impact on the precision of the HPET as a timesource? Or,
> is it so coarse already that this isn't an issue?

The returned time value can always be subject to unexpected latency, for example from an interrupt or an NMI. As long as the time never goes backward, I think it should be fine.

Cheers,
Longman