Re: [BELATED CORE TOPIC] context tracking / nohz / RCU state

From: Paul E. McKenney
Date: Tue Aug 11 2015 - 14:33:23 EST


On Tue, Aug 11, 2015 at 10:49:36AM -0700, Andy Lutomirski wrote:
> This is a bit late, but here goes anyway.
>
> Having played with the x86 context tracking hooks for a while, I think
> it would be nice if core code that needs to be aware of CPU context
> (kernel, user, idle, guest, etc.) could come up with a single,
> comprehensible, easily validated set of hooks that arch code is
> supposed to call.
>
> Currently we have:
>
> - RCU hooks, which come in a wide variety to notify about IRQs, NMIs, etc.

Something about people yelling at me for waking up idle CPUs, thus
degrading their battery lifetimes. ;-)

> - Context tracking hooks. Only used by some arches. Calling these
> calls the RCU hooks for you in most cases. They have weird
> interactions with interrupts and they're slow.

Combining these would be good, but there are subtleties. For example,
some arches don't have context tracking, but RCU still needs to correctly
identify idle CPUs without in any way interrupting or awakening those CPUs.
It would be good to make this faster, but it does have to work.
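
One way to observe idleness without touching the idle CPU is a per-CPU
counter that only the owning CPU increments on idle entry and exit, and
that other CPUs merely read. What follows is a minimal userspace sketch
of that idea using C11 atomics, not the kernel's implementation. The
names (struct cpu_ctx, ctx_idle_enter() and friends) and the convention
that an even value means idle are assumptions made for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-CPU state. Assumed convention: even = idle, odd = non-idle. */
struct cpu_ctx {
	atomic_uint dynticks;
};

/* Start out non-idle, so the counter begins at an odd value. */
static void ctx_init(struct cpu_ctx *c)
{
	atomic_init(&c->dynticks, 1);
}

/* Run by the owning CPU on its way into idle: odd -> even. */
static void ctx_idle_enter(struct cpu_ctx *c)
{
	unsigned int v = atomic_fetch_add(&c->dynticks, 1) + 1;

	assert(!(v & 1));
}

/* Run by the owning CPU on its way out of idle: even -> odd. */
static void ctx_idle_exit(struct cpu_ctx *c)
{
	unsigned int v = atomic_fetch_add(&c->dynticks, 1) + 1;

	assert(v & 1);
}

/* Run by a remote CPU: a plain read, no interrupt, no wakeup. */
static unsigned int ctx_snapshot(struct cpu_ctx *c)
{
	return atomic_load(&c->dynticks);
}

/*
 * Run by a remote CPU some time after taking the snapshot. The target
 * CPU was idle at some point if the snapshot was even (idle back then)
 * or if the counter has moved since (an odd value can only change by
 * passing through an even one).
 */
static bool ctx_was_idle_since(struct cpu_ctx *c, unsigned int snap)
{
	return !(snap & 1) || atomic_load(&c->dynticks) != snap;
}
```

The remote side only ever reads the counter, so the cost lands on the
owning CPU's idle entry and exit paths, which is where the "make this
faster" pressure comes from.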

> - vtime. Beats the heck out of me.
>
> - Whatever deferred things Christoph keeps reminding us about.
>
> Honestly, I don't fully understand what all these hooks are supposed
> to do, nor do I care all that much. From my perspective, the core
> code should be able to do whatever it wants and rely on appropriate
> notifications from arch code. It would be great if we could come up
> with something straightforward that covers everything. For example:
>
> user_mode_to_kernel_mode()
> kernel_mode_to_user_mode()
> kernel_mode_to_guest_mode()
> in_a_periodic_tick()
> starting_nmi()
> ending_nmi()

kernel_mode_nonidle_to_idle()
kernel_mode_idle_to_nonidle()

> may_i_turn_off_ticks_right_now()

This is RCU if CONFIG_RCU_FAST_NO_HZ=n.

> or, better yet:
> i_am_turning_off_ticks_right_now_and_register_your_own_darned_hrtimer_if_thats_a_problem()

This is RCU if CONFIG_RCU_FAST_NO_HZ=y. It would not be difficult to
make RCU do this if CONFIG_RCU_FAST_NO_HZ=n as well, but doing so would
increase the overhead of entering and leaving idle.
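
To make the "single, easily validated set of hooks" idea concrete, here
is a minimal userspace sketch of the transition hooks named so far,
each checked against a tracked current context. It is an illustration
only, not a proposed kernel API. The hook names are taken from this
thread, but struct ctx_tracker, enum cpu_context, ctx_transition() and
the bool return values are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical set of contexts a CPU can be in. */
enum cpu_context { CTX_USER, CTX_KERNEL, CTX_IDLE, CTX_GUEST };

/* Hypothetical per-CPU tracker that core code would own. */
struct ctx_tracker {
	enum cpu_context cur;
	unsigned int nmi_depth;
};

/* Refuse any transition that does not start from the tracked context. */
static bool ctx_transition(struct ctx_tracker *t,
			   enum cpu_context from, enum cpu_context to)
{
	assert(t != NULL);
	if (t->cur != from)
		return false;
	t->cur = to;
	return true;
}

static bool user_mode_to_kernel_mode(struct ctx_tracker *t)
{
	return ctx_transition(t, CTX_USER, CTX_KERNEL);
}

static bool kernel_mode_to_user_mode(struct ctx_tracker *t)
{
	return ctx_transition(t, CTX_KERNEL, CTX_USER);
}

static bool kernel_mode_to_guest_mode(struct ctx_tracker *t)
{
	return ctx_transition(t, CTX_KERNEL, CTX_GUEST);
}

static bool kernel_mode_nonidle_to_idle(struct ctx_tracker *t)
{
	return ctx_transition(t, CTX_KERNEL, CTX_IDLE);
}

static bool kernel_mode_idle_to_nonidle(struct ctx_tracker *t)
{
	return ctx_transition(t, CTX_IDLE, CTX_KERNEL);
}

/* NMIs may nest and may arrive in any context, so only a depth is kept. */
static void starting_nmi(struct ctx_tracker *t)
{
	t->nmi_depth++;
}

static bool ending_nmi(struct ctx_tracker *t)
{
	if (t->nmi_depth == 0)
		return false;	/* unbalanced ending_nmi() */
	t->nmi_depth--;
	return true;
}
```

With every hook funneled through one checked transition function, an
arch that calls the hooks in the wrong order gets a failure at the call
site instead of silently confusing RCU or the tick code later.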

> Some arches may need:
>
> i_am_lame_and_forgot_my_previous_context()
>
> x86 will soon (4.3 or 4.4, depending on how my syscall cleanup goes)
> no longer need that.
>
> Paul says that some arches need something that goes straight from IRQ
> to user mode (?) -- sigh.

Straight from IRQ to process-level kernel mode. I ran into this in
late 2011, and clearly should have documented exactly what code was
doing this. Something about invoking system calls from within the
kernel on some architectures.

Hey, if no architectures do this anymore, I could simplify RCU a bit! ;-)

> etc.
>
> It might make sense to get enough people who understand what's going
> on behind the scenes together to hash out the requirements.

Please count me in!

Thanx, Paul
