Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

From: Paul E. McKenney
Date: Mon Jun 02 2014 - 17:02:42 EST


On Mon, Jun 02, 2014 at 01:22:10PM -0700, Linus Torvalds wrote:
> On Mon, Jun 2, 2014 at 1:05 PM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> >
> > So the question is, do you prefer subtly broken code or hard compile
> > fails? Me, I go for the compile fail.
>
> The thing is, parisc has a perfectly fine "cmpxchg" implementation in
> practice, and ACCESS_ONCE() and friends work fine too for reading.
>
> What the "use a spinlock" approach cannot generally do is:
>
> - ACCESS_ONCE() to _write_ things doesn't work well. You really
> should use "atomic_set()".
>
> - you may not necessarily be able to mix partial updates (ie
> differently sized updates to the same thing) depending on just how the
> spinlock hashing works
>
> but both of those are really rare issues and don't affect normal code.
>
> I would not necessarily be opposed to splitting up ACCESS_ONCE() for
> reading and for writing, and maybe we could do something special for
> the writing path (which tends to be less ctitical). It's really mixing
> "ACCESS_ONCE(x)" to _set_ a value, together with atomic ops to update
> it, that ends up being problematic.

Knowing what I know now about how ACCESS_ONCE() is used, I would have
split it for reading and writing to begin with. Where is that time
machine when you need it? ;-)

> Maybe there are other issues I can't think of right now. But
> basically, parisc _can_ do cmpxchg, it's just that the code needs to
> be somewhat sanitized.
>
> Side note: some of the RCU code uses "ACCESS_ONCE()" for
> read-modify-write code, which is just f*cking crazy. The semantics are
> dubious, and it generally makes gcc create bad code too.

A couple of the places are admittedly overkill, for example the pair of:

ACCESS_ONCE(rsp->n_force_qs_lh)++;

which is just for debug statistics in any case. I could put these back
to "rsp->n_force_qs_lh++;" without problems, if desired. (Yeah, I know,
I am overly paranoid.)

However, these cases need a bit more care:

ACCESS_ONCE(rdp->qlen)++;
ACCESS_ONCE(rsp->n_barrier_done)++;
ACCESS_ONCE(sync_rcu_preempt_exp_count)++;

In the ->qlen case, interrupts are disabled and the current CPU is
the only one who can write, so the read need not be volatile. In the
->n_barrier_done, modifications are done holding ->barrier_mutex, so again
the read need not be volatile. In the sync_rcu_preempt_exp_count case,
modifications are done holding sync_rcu_preempt_exp_mutex, so once again,
the read need not be volatile. So I could do something like:

ACCESS_ONCE(rdp->qlen) = rdp->qlen + 1;

But that still makes gcc generate bad code.

The reason I was not all that worried about this is that these are not
in fastpaths, and the last two are especially not in fastpaths.

Suggestions?

Thanx, Paul

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/