Re: [PATCH 2/3] task: RCU protect tasks on the runqueue

From: Eric W. Biederman
Date: Thu Sep 05 2019 - 16:03:16 EST


"Paul E. McKenney" <paulmck@xxxxxxxxxx> writes:

> On Tue, Sep 03, 2019 at 10:06:03PM +0200, Peter Zijlstra wrote:
>> On Tue, Sep 03, 2019 at 12:18:47PM -0700, Linus Torvalds wrote:
>> > Now, if you can point to some particular field where that ordering
>> > makes sense for the particular case of "make it active on the
>> > runqueue" vs "look up the task from the runqueue using RCU", then I do
>> > think that the whole release->acquire consistency makes sense.
>> >
>> > But it's not clear that such a field exists, particularly when this is
>> > in no way the *common* way to even get a task pointer, and other paths
>> > do *not* use the runqueue as the serialization point.
>>
>> Even if we could find a case (and I'm not seeing one in a hurry), I
>> would try really hard to avoid adding extra barriers here and instead
>> make the consumer a little more complicated if at all possible.
>>
>> The Power folks got rid of a SYNC (yes, more expensive than LWSYNC) from
>> their __switch_to() implementation and that had a measurable impact.
>>
>> 9145effd626d ("powerpc/64: Drop explicit hwsync in context switch")
>
> The patch [1] looks good to me. And yes, if the structure pointed to by
> the second argument of rcu_assign_pointer() is already visible to readers,
> it is OK to instead use RCU_INIT_POINTER(). Yes, this loses ordering.
> But weren't these simple assignments before RCU got involved?
>
> As a very rough rule of thumb, LWSYNC is about twice as fast as SYNC.
> (Depends on workload, exact details of the hardware, timing, phase of
> the moon, you name it.) So removing the LWSYNC is likely to provide
> measureable benefit, but I must defer to the powerpc maintainers.
> To that end, I added Michael on CC.
>
> [1] https://lore.kernel.org/lkml/878sr6t21a.fsf_-_@xxxxxxxxxxxxxxxxxxxxx/

Paul, what is the purpose of the barrier in rcu_assign_pointer?

My intuition says it is the assignment half of rcu_dereference, and that
anything that rcu_dereference does not need is too strong.

My basic understanding is that only alpha ever has the memory ordering
issue that rcu_dereference deals with. So I am a bit surprised that
this is anything other than a noop for anything except alpha.

In my patch I used rcu_assign_pointer because that is the canonically
correct way to do things. Peter makes a good case that adding an extra
barrier in ___schedule could be detrimental to system performance.
At the same time if there is a correctness issue on alpha that we have
been overlooking because of low testing volume on alpha I don't want to
just let this slide and have very subtle bugs.

The practical concern is that people have been really wanting to do
lockless and rcu operations on tasks in the runqueue for a while and
there are several very clever pieces of code doing that now. By
changing the location of the rcu put I am trying to make these uses
ordinary rcu uses.

The uses in question are the pieces of code I update in:
https://lore.kernel.org/lkml/8736het20c.fsf_-_@xxxxxxxxxxxxxxxxxxxxx/

In short. Why is rcu_assign_pointer() more than WRITE_ONCE() on
anything but alpha? Do other architectures need more? Is the trade off
worth it if we avoid using rcu_assign_pointer on performance critical
paths.

Eric

p.s. I am being slow at working through all of this as I am dealing
with my young baby son, and busy packing for the conference.

I might not be able to get back to this discussion until after
I have landed in Lisbon on Saturday night.