Re: [PATCH 3/7] barrier: a scalable synchonisation barrier
From: Oleg Nesterov
Date: Sat Feb 03 2007 - 11:39:42 EST
On 01/31, Paul E. McKenney wrote:
>
> QRCU as currently written (http://lkml.org/lkml/2006/11/29/330) doesn't
> do what you want, as it acquires the lock unconditionally. I am proposing
> that synchronize_qrcu() change to something like the following:
>
> void synchronize_qrcu(struct qrcu_struct *qp)
> {
> int idx;
>
> smp_mb();
>
> if (atomic_read(qp->ctr[0]) + atomic_read(qp->ctr[1]) <= 1) {
> smp_rmb();
> if (atomic_read(qp->ctr[0]) +
> atomic_read(qp->ctr[1]) <= 1)
> goto out;
> }
>
> mutex_lock(&qp->mutex);
> idx = qp->completed & 0x1;
> atomic_inc(qp->ctr + (idx ^ 0x1));
> /* Reduce the likelihood that qrcu_read_lock() will loop */
> smp_mb__after_atomic_inc();
I almost forgot. Currently this smp_mb__after_atomic_inc() is not strictly
needed, and the comment is correct. However, it becomes mandatory with your
optimization. Without this barrier, it is possible that both checks above
mutex_lock() will see the result of atomic_dec(), but not the atomic_inc().
So, may I ask you to also update this comment?
/*
* Reduce the likelihood that qrcu_read_lock() will loop
* AND
* make sure the second re-check above will see the result
* of atomic_inc() if it sees the result of atomic_dec()
*/
Something like this, I hope you will make it better.
And another note: this all assumes that STORE-MB-LOAD works "correctly", yes?
We have other code which relies on that, should not be a problem.
(Alan Stern cc'ed).
Oleg.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/