Re: [PATCH] rcu: Is it safe to enter an RCU read-side critical section?

From: Frederic Weisbecker
Date: Fri Sep 06 2013 - 14:59:40 EST


On Fri, Sep 06, 2013 at 10:41:17AM -0700, Paul E. McKenney wrote:
> On Fri, Sep 06, 2013 at 10:21:28AM -0700, Eric Dumazet wrote:
> > On Fri, 2013-09-06 at 08:18 -0700, Paul E. McKenney wrote:
> >
> > > int rcu_is_cpu_idle(void)
> > > {
> > > 	int ret;
> > >
> > > 	preempt_disable();
> > > 	ret = (atomic_read(&__get_cpu_var(rcu_dynticks).dynticks) & 0x1) == 0;
> > > 	preempt_enable();
> > > 	return ret;
> > > }
> >
> > Paul I find this very confusing.
> >
> > If caller doesn't have preemption disabled, what could be the meaning of
> > this rcu_is_cpu_idle() call ?
> >
> > Its result is meaningless if suddenly thread is preempted, so what is
> > the point ?
> >
> > Sorry if this is obvious to you.
>
> It is a completely fair question. In fact, this might well now be
> pointing to a bug given NO_HZ_FULL.
>
> The assumption is that if you don't have preemption disabled, you had
> better be running on a CPU that RCU is paying attention to. The rationale
> involves preemptible RCU.
>
> Suppose that you just did rcu_read_lock() on a CPU that RCU is paying
> attention to. All is well, and rcu_is_cpu_idle() will return false, as
> expected. Suppose now that it is possible to be preempted and suddenly
> find yourself running on a CPU that RCU is not paying attention to.
> This would have the effect of making your RCU read-side critical section
> be ignored. Therefore, it had better not be possible to be preempted
> from a CPU to which RCU is paying attention to a CPU that RCU is ignoring.
>
> So if rcu_is_cpu_idle() returns false, you had better be guaranteed
> that whatever CPU you are running on (which might well be a different
> one than the rcu_is_cpu_idle() was running on) is being watched by RCU.
>
> So, Frederic, does this still work with NO_HZ_FULL? If not, I believe
> we have a bigger problem than the preempt_disable() in rcu_is_cpu_idle()!

Sure, it works well, because the scheduler task entrypoints exit those RCU
extended quiescent states.

Imagine that you're running in an RCU read-side critical section on CPU 0, which
is not in an extended quiescent state. Now you get preempted in the middle of your
RCU read-side critical section (you called rcu_read_lock() but not yet rcu_read_unlock()).

Later on, the task is woken up to be scheduled on CPU 1. If CPU 1 is in an extended
quiescent state because it is running in userspace, it receives a scheduler IPI.
schedule_user() is then called at the end of the interrupt and in turn calls rcu_user_exit()
before the task resumes the code it was running on CPU 0, in the middle of
the RCU read-side critical section.

See, the key here is the rcu_user_exit() call, which puts the CPU back on RCU's state machine.
There are other possible scheduler entrypoints when a CPU runs in user extended quiescent
state: exception and syscall entries or even preempt_schedule_irq() in case we receive an irq
in the kernel while we haven't yet reached the call to rcu_user_exit()... All of these should
be covered, otherwise you bet RCU would be prompt to warn.
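
To make the CPU 0 -> CPU 1 scenario concrete, here is a toy user-space model
of it. This is only a sketch, not kernel code: every model_* name is made up
for illustration, the per-CPU counter is a plain array, and the only thing
borrowed from the real code is the "low bit clear means idle" convention of
the quoted rcu_is_cpu_idle().

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 2

/* Toy per-CPU counter: even = extended quiescent state, odd = RCU is watching. */
static unsigned int model_dynticks[MODEL_NR_CPUS];

static bool model_rcu_is_cpu_idle(int cpu)
{
	return (model_dynticks[cpu] & 0x1) == 0;
}

/* Hypothetical stand-in for rcu_user_exit(): leave the extended quiescent state. */
static void model_rcu_user_exit(int cpu)
{
	if (model_rcu_is_cpu_idle(cpu))
		model_dynticks[cpu]++;
}

/*
 * Hypothetical stand-in for the scheduler entrypoint (schedule_user() in the
 * scenario above): the target CPU exits its extended quiescent state before
 * the task resumes there. Returns the CPU the task now runs on.
 */
static int model_schedule_task_on(int cpu)
{
	model_rcu_user_exit(cpu);
	return cpu;
}

/* Replays the migration and returns what the reader observes afterwards. */
static bool model_migrated_reader_sees_idle(void)
{
	int cpu;

	model_dynticks[0] = 1;	/* CPU 0: in the kernel, RCU is watching */
	model_dynticks[1] = 0;	/* CPU 1: in userspace, extended quiescent state */

	cpu = 0;		/* rcu_read_lock() taken here, then preempted */
	assert(!model_rcu_is_cpu_idle(cpu));

	cpu = model_schedule_task_on(1);	/* woken up on CPU 1 */
	return model_rcu_is_cpu_idle(cpu);	/* still inside the critical section */
}
```

In this model the reader never observes an idle CPU after the migration,
because the schedule-in step flips CPU 1's counter to odd before the task
resumes. Drop the model_rcu_user_exit() call and the reader would find
itself on a CPU that RCU ignores, which is the case Paul was worried about.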

That's why calling rcu_is_cpu_idle() from an RCU read-side critical section is legit, even
if we can be preempted at any time around it.
And preempt_disable() is probably not even necessary, except perhaps if __get_cpu_var() itself
relies on non-preemptibility for the correctness of its address calculation.
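
The "except perhaps" case can be sketched with another toy user-space model.
It assumes that the per-CPU access happens in two steps (compute the address,
then read) and that a migration can land between them. That is an assumption
of the model only, not a statement about how __get_cpu_var() behaves on any
architecture, and all toy_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS 2

static unsigned int toy_dynticks[TOY_NR_CPUS];	/* even = idle, odd = watched */
static int toy_current_cpu;			/* CPU the task runs on */
static int toy_preempt_count;			/* > 0: migration not allowed */

/* Start state: task on CPU 0 (watched by RCU), CPU 1 idle in userspace. */
static void toy_reset(void)
{
	toy_dynticks[0] = 1;
	toy_dynticks[1] = 0;
	toy_current_cpu = 0;
	toy_preempt_count = 0;
}

/*
 * Worst-case preemption point: the task moves to CPU 1, which leaves its
 * extended quiescent state first, and CPU 0 then goes idle. Does nothing
 * while preemption is disabled in the model.
 */
static void toy_maybe_migrate(void)
{
	if (toy_preempt_count > 0)
		return;
	toy_dynticks[1] = 1;
	toy_dynticks[0] = 2;
	toy_current_cpu = 1;
}

static bool toy_is_cpu_idle(bool disable_preempt)
{
	unsigned int *addr;
	bool ret;

	if (disable_preempt)
		toy_preempt_count++;
	addr = &toy_dynticks[toy_current_cpu];	/* step 1: per-CPU address */
	toy_maybe_migrate();			/* possible preemption here */
	ret = (*addr & 0x1) == 0;		/* step 2: read the counter */
	if (disable_preempt)
		toy_preempt_count--;
	return ret;
}
```

With preemption disabled, the model reads the counter of the CPU it is
really running on and reports "not idle". Without it, the stale address
points at CPU 0's counter after CPU 0 went idle, so the model reports "idle"
even though the task is running on a watched CPU. If the real address
calculation cannot be split like this, the preempt_disable() buys nothing.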

Thanks.