On Thu, Apr 09, 2015 at 05:41:44PM -0400, Waiman Long wrote:
Let me explain (even though I propose taking all the LFSR stuff out).I don't understand what you meant here.+void __init __pv_init_lock_hash(void)pv_taps = lfsr_taps(pv_lock_hash_bits);
+ int pv_hash_size = 4 * num_possible_cpus();
+ if (pv_hash_size< (1U<< LFSR_MIN_BITS))
+ pv_hash_size = (1U<< LFSR_MIN_BITS);
+ * Allocate space from bootmem which should be page-size aligned
+ * and hence cacheline aligned.
+ pv_lock_hash = alloc_large_system_hash("PV qspinlock",
+ sizeof(struct pv_hash_bucket),
+ pv_hash_size, 0, HASH_EARLY,
+ &pv_lock_hash_bits, NULL,
+ pv_hash_size, pv_hash_size);
pv_lock_hash_bit is a runtime variable, therefore it cannot compile time
evaluate the forest of if statements required to compute the taps value.
Therefore its best to compute the taps _once_, and this seems like a
good place to do so.
Still, no reason to not pre-compute the taps value, its simple enough.The minimum bits size is now 8. So unless the system has more than 64 vCPUs,+ goto done;Since pv_lock_hash_bits is a variable, you end up running through that
+ hash = lfsr(hash, pv_lock_hash_bits, 0);
massive if() forest to find the corresponding tap every single time. It
cannot compile-time optimize it.
it will get the right value in the first if statement.