Re: [BELATED CORE TOPIC] context tracking / nohz / RCU state

From: Frederic Weisbecker
Date: Wed Aug 12 2015 - 10:52:32 EST

On Tue, Aug 11, 2015 at 12:07:54PM -0700, Andy Lutomirski wrote:
> On Tue, Aug 11, 2015 at 11:33 AM, Paul E. McKenney
> <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
> > On Tue, Aug 11, 2015 at 10:49:36AM -0700, Andy Lutomirski wrote:
> >> This is a bit late, but here goes anyway.
> >>
> >> Having played with the x86 context tracking hooks for a while, I think
> >> it would be nice if core code that needs to be aware of CPU context
> >> (kernel, user, idle, guest, etc.) could come up with a single,
> >> comprehensible, easily validated set of hooks that arch code is
> >> supposed to call.
> >>
> >> Currently we have:
> >>
> >> - RCU hooks, which come in a wide variety to notify about IRQs, NMIs, etc.
> >
> > Something about people yelling at me for waking up idle CPUs, thus
> > degrading their battery lifetimes. ;-)
> >
> >> - Context tracking hooks. Only used by some arches. Calling these
> >> calls the RCU hooks for you in most cases. They have weird
> >> interactions with interrupts and they're slow.
> >
> > Combining these would be good, but there are subtleties. For example,
> > some arches don't have context tracking, but RCU still needs to correctly
> > identify idle CPUs without in any way interrupting or awakening that CPU.
> > It would be good to make this faster, but it does have to work.
>
> Could we maybe have one set of old RCU-only (no context tracking)
> callbacks and a completely separate set of callbacks for arches that
> support full context tracking? The implementation of the latter would
> presumably call into RCU.

That's already what we do, I think.

rcu_idle_enter()/rcu_idle_exit() are the old RCU-only hooks, and the rest
(rcu_user_enter()/rcu_user_exit()) use context tracking.
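
To illustrate that split, here is a small userspace C model. It is not
kernel code: every model_* name is made up for this sketch, and the real
hooks do considerably more. It assumes the idle hooks talk to RCU directly
on every arch, while the user hooks exist only where context tracking is
supported and call into RCU from there.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
enum ctx_state { CTX_KERNEL, CTX_USER, CTX_IDLE };

struct cpu_model {
	enum ctx_state state;          /* what context tracking believes */
	bool rcu_watching;             /* true while RCU must treat the CPU as active */
	bool has_ctx_tracking;         /* arch supports full context tracking */
	unsigned long rcu_transitions; /* number of RCU state flips so far */
};

/* RCU-only layer: flips the "RCU is watching this CPU" state. */
static void model_rcu_eqs_enter(struct cpu_model *cpu)
{
	assert(cpu->rcu_watching);     /* entering twice would be a bug */
	cpu->rcu_watching = false;
	cpu->rcu_transitions++;
}

static void model_rcu_eqs_exit(struct cpu_model *cpu)
{
	assert(!cpu->rcu_watching);    /* exiting twice would be a bug */
	cpu->rcu_watching = true;
	cpu->rcu_transitions++;
}

/* Idle path: usable on every arch, goes straight to the RCU layer. */
static void model_idle_enter(struct cpu_model *cpu)
{
	cpu->state = CTX_IDLE;
	model_rcu_eqs_enter(cpu);
}

static void model_idle_exit(struct cpu_model *cpu)
{
	model_rcu_eqs_exit(cpu);
	cpu->state = CTX_KERNEL;
}

/* User path: a no-op without context tracking, else calls into RCU. */
static void model_user_enter(struct cpu_model *cpu)
{
	if (!cpu->has_ctx_tracking)
		return;
	cpu->state = CTX_USER;
	model_rcu_eqs_enter(cpu);
}

static void model_user_exit(struct cpu_model *cpu)
{
	if (!cpu->has_ctx_tracking)
		return;
	model_rcu_eqs_exit(cpu);
	cpu->state = CTX_KERNEL;
}
```

In this model an arch without context tracking never tells RCU about
user mode at all, so RCU keeps watching that CPU while it runs userspace.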

> >> may_i_turn_off_ticks_right_now()
> >
> > This is RCU if CONFIG_RCU_FAST_NO_HZ=n.
> >
> >> or, better yet:
> >> i_am_turning_off_ticks_right_now_and_register_your_own_darned_hrtimer_if_thats_a_problem()
> >
> > This is RCU if CONFIG_RCU_FAST_NO_HZ=y. It would not be difficult to
> > make RCU do this if CONFIG_RCU_FAST_NO_HZ=n as well, but doing so would
> > increase to/from idle overhead.
>
> If things actually end up using hrtimers, we might also want
> get_off_my_lawn() aka "isolate this cpu now and try to do all the
> deferred stuff right now and kill off those hrtimers".

Yeah, that's what we are trying to do. But hrtimers aren't special here;
they are a source of noise just like any other.

> Rik is (was?) trying to make some housekeeper CPU probe other CPUs'
> state to eliminate the need for exact vtime accounting and thus speed
> up transitions to/from user or idle.

Only user. And that's only about vtime. RCU still needs to be handled.

> It would be really neat if we
> could simultaneously have quick idle/user transitions *and* avoid
> deferred per-cpu work interrupting idle/user mode.

I think that's the goal. If we eventually offline the vtime accounting,
all that remains is RCU hooks on user/kernel transitions.
