Re: [PATCH 2/3] task: RCU protect tasks on the runqueue

From: Linus Torvalds
Date: Tue Sep 03 2019 - 15:19:10 EST

On Tue, Sep 3, 2019 at 11:13 AM Eric W. Biederman <ebiederm@xxxxxxxxxxxx> wrote:
> I think this is where I am looking a things differently than you and
> Peter. Why does it have to be ___schedule() that changes the value
> in the task_struct? Why can't it be something else that changes the
> value and then proceeds to call schedule()?

No, I think we're in violent agreement here: it's _not_ necessary
schedule that changes any values at all. The values behind the pointer
are live both before - and even more so _after_ - we put the process
on the percpu rq.

> What is the size of the window of changes that is relevant?

It's not the _size_ of the window that is relevant. It's the _direction_.

A "smp_store_release()" is only an ordering wrt previous changes. Why
would _previous_ changes be special? They aren't. In many ways, the
task struct before it is on the runqueue is fairly static. It's only
_after_ it is on the runqueue that the process starts doing things to
its own data structures.

> If we use RCU_INIT_POINTER if there was something that changed
> task_struct and then called schedule() what ensures that a remote cpu
> that has a stale copy of task_struct cached will update it's cache
> after following the new value rq->curr? Don't we need
> rcu_assign_pointer to get that guarantee?

Why are those "before you called schedule" modifications special?

It's _way_ more likely that the task struct fields will change _after_
it was scheduled in.

So in many ways, the scheduling point isn't really the most natural
place for a barrier. It's just an event. But it's not clear why
"changes before that" should be synchronized or be a special case. The
process was visible other ways long before it's being actively run.

So why add a barrier to the scheduler when it's not clear that it makes sense?

Now, if you can point to some particular field where that ordering
makes sense for the particular case of "make it active on the
runqueue" vs "look up the task from the runqueue using RCU", then I do
think that the whole release->acquire consistency makes sense.

But it's not clear that such a field exists, particularly when this is
in no way the *common* way to even get a task pointer, and other paths
do *not* use the runqueue as the serialization point.

See what I'm saying?

Is the runqueue addition point special for synchronization? I don't
see it, and it historically has never been.

But *IF* it is, then yes, then rcu_assign_pointer() would make sense.