Re: [PATCH v15 09/15] pvqspinlock: Implement simple paravirt support for the qspinlock

From: Waiman Long
Date: Thu Apr 09 2015 - 16:37:12 EST

Next message: Mark Brown: "Re: [PATCH v6 3/6] regulator: axp20x: prepare support for multiple AXP chip families"
Previous message: Jason Low: "Re: [PATCH 2/2] locking/rwsem: Use a return variable in rwsem_spin_on_owner()"
In reply to: Peter Zijlstra: "Re: [PATCH v15 09/15] pvqspinlock: Implement simple paravirt support for the qspinlock"
Next in thread: Waiman Long: "Re: [PATCH v15 09/15] pvqspinlock: Implement simple paravirt support for the qspinlock"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On 04/09/2015 02:23 PM, Peter Zijlstra wrote:

On Thu, Apr 09, 2015 at 08:13:27PM +0200, Peter Zijlstra wrote:

On Mon, Apr 06, 2015 at 10:55:44PM -0400, Waiman Long wrote:

+#define PV_HB_PER_LINE (SMP_CACHE_BYTES / sizeof(struct pv_hash_bucket))
+static struct qspinlock **pv_hash(struct qspinlock *lock, struct pv_node *node)
+{
+ unsigned long init_hash, hash = hash_ptr(lock, pv_lock_hash_bits);
+ struct pv_hash_bucket *hb, *end;
+
+ if (!hash)
+ hash = 1;
+
+ init_hash = hash;
+ hb =&pv_lock_hash[hash_align(hash)];
+ for (;;) {
+ for (end = hb + PV_HB_PER_LINE; hb< end; hb++) {
+ if (!cmpxchg(&hb->lock, NULL, lock)) {
+ WRITE_ONCE(hb->node, node);
+ /*
+ * We haven't set the _Q_SLOW_VAL yet. So
+ * the order of writing doesn't matter.
+ */
+ smp_wmb(); /* matches rmb from pv_hash_find */
+ goto done;
+ }
+ }
+
+ hash = lfsr(hash, pv_lock_hash_bits, 0);

Since pv_lock_hash_bits is a variable, you end up running through that
massive if() forest to find the corresponding tap every single time. It
cannot compile-time optimize it.

Hence:
hash = lfsr(hash, pv_taps);

(I don't get the bits argument to the lfsr).

In any case, like I said before, I think we should try a linear probe
sequence first, the lfsr was over engineering from my side.

+ hb =&pv_lock_hash[hash_align(hash)];

So one thing this does -- and one of the reasons I figured I should
ditch the LFSR instead of fixing it -- is that you end up scanning each
bucket HB_PER_LINE times.

I am aware of that when I was trying to add the hash table debug code, but I want to get the code out for review and so hasn't made any change yet. I have just done testing by adding some debug code to check the hashing efficiency. With the kernel build workload, with over 1M calls to pv_hash(), all of them get an empty entry on the first try. Maybe the minimum hash table size of 256 helps.

The 'fix' would be to LFSR on cachelines instead of HBs but then you're
stuck with the 0-th cacheline.

This should not be a big problem. I just need to add a check at the end of the for loop that if hash is 0, change it to a certain non-0 value instead of calling lfsr().

As for ditching the lfsr idea, I am fine with that. So there will be 4 entries (1 cacheline) for each hash value. If all the entries are full, we proceed to the next cacheline. Right?

Cheers,
Longman
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Next message: Mark Brown: "Re: [PATCH v6 3/6] regulator: axp20x: prepare support for multiple AXP chip families"
Previous message: Jason Low: "Re: [PATCH 2/2] locking/rwsem: Use a return variable in rwsem_spin_on_owner()"
In reply to: Peter Zijlstra: "Re: [PATCH v15 09/15] pvqspinlock: Implement simple paravirt support for the qspinlock"
Next in thread: Waiman Long: "Re: [PATCH v15 09/15] pvqspinlock: Implement simple paravirt support for the qspinlock"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]