On 04/11/2016 04:21 PM, Andy Lutomirski wrote:
+			/* Unlock */
+			smp_store_release(&hpet_save.seq, new + 1);
+			local_irq_restore(flags);
+			return (cycle_t)time;
+		}
+		local_irq_restore(flags);
+		seq = new;
+	}
+
+	/*
+	 * Wait until the locked sequence number changes which indicates
+	 * that the saved HPET value is up-to-date.
+	 */
+	while (READ_ONCE(hpet_save.seq) == seq) {
+		/*
+		 * Since reading the HPET is much slower than a single
+		 * cpu_relax() instruction, we use two here in an attempt
+		 * to reduce the amount of cacheline contention in the
+		 * hpet_save.seq cacheline.
+		 */
+		cpu_relax();
+		cpu_relax();
+	}
+
+	return (cycle_t)READ_ONCE(hpet_save.hpet);
+}

I wonder if this could be simplified. Pseudocode:
u32 time;
unsigned long flags;
local_irq_save(flags);
if (spin_trylock(&hpet_lock)) {
time = hpet_readl(HPET_COUNTER);
WRITE_ONCE(last_hpet_counter, time);
You will need a spin_unlock(&hpet_lock) here.
} else {
spin_unlock_wait(&hpet_lock);
/* When this function started, hpet_lock was locked. Now it's
unlocked, which means that time is at least as new as whatever the
lock holder returned. */
time = READ_ONCE(last_hpet_counter);
}
local_irq_restore(flags);
return time;
Should be faster under heavy contention, too: spinlocks are very nicely
optimized.
I don't think it will be faster. The current spinlock code isn't more optimized than what you can do with a cmpxchg and smp_store_release; in fact, that is essentially what the spinlock code is doing internally. Other differences include:
1) With my patch, a CPU will not do local_irq_save/local_irq_restore when the lock is not free.
2) My patch also uses a sequence number change to indicate that an updated time stamp is available. So there will be cases where CPUs running your code have to wait while those running my code can grab the time stamp and return immediately.