Re: Use case for TASKS_RCU

From: Paul E. McKenney
Date: Fri May 19 2017 - 09:36:00 EST


On Fri, May 19, 2017 at 08:23:31AM +0200, Ingo Molnar wrote:
>
> * Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
>
> > On Tue, May 16, 2017 at 08:22:33AM +0200, Ingo Molnar wrote:
> > >
> > > * Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
> > >
> > > > Hello!
> > > >
> > > > The question of the use case for TASKS_RCU came up, and here is my
> > > > understanding. Steve will not be shy about correcting any misconceptions
> > > > I might have. ;-)
> > > >
> > > > The use case is to support freeing of trampolines used in tracing/probing
> > > > in CONFIG_PREEMPT=y kernels. It is necessary to wait until any task
> > > > executing in the trampoline in question has left it, taking into account
> > > > that the trampoline's code might be interrupted and preempted. However,
> > > > the code in the trampolines is guaranteed never to context switch.
> > > >
> > > > Note that in CONFIG_PREEMPT=n kernels, synchronize_sched() suffices.
> > > > It is therefore tempting to think in terms of disabling preemption across
> > > > the trampolines, but there is apparently not enough room to accommodate
> > > > the needed preempt_disable() and preempt_enable() in the code invoking
> > > > the trampoline, and putting the preempt_disable() and preempt_enable()
> > > > in the trampoline itself fails because of the possibility of preemption
> > > > just before the preempt_disable() and just after the preempt_enable().
> > > > Similar reasoning rules out use of rcu_read_lock() and rcu_read_unlock().
> > >
> > > So how was this solved before TASKS_RCU? Also, nothing uses call_rcu_tasks() at
> > > the moment, so it's hard for me to review its users. What am I missing?
> >
> > Before TASKS_RCU, the trampolines were just leaked when CONFIG_PREEMPT=y.
> >
> > Current mainline kernel/trace/ftrace.c uses synchronize_rcu_tasks().
> > So yes, currently one user.
>
> So why not schedule a worklet on every CPU to drive the trampoline freeing? To
> guarantee that nothing was preempted it could run at SCHED_IDLE and could observe
> nr_running from the worklet and use a short timeout loop. Batching and hysteresis
> would ensure that this is only running rarely in practice.
>
> It doesn't have to be fast or particularly elegant, but it could use existing
> kernel facilites just fine: it's a corner case cost and quirk of our live kernel
> text modifying trampoline code and our current CONFIG_PREEMPT=y model.
>
> I.e. don't make it an RCU facility that complicates not just the RCU code but has
> various costs in generic code as well:
>
> kernel/exit.c: TASKS_RCU(int tasks_rcu_i);
> kernel/exit.c: TASKS_RCU(preempt_disable());
> kernel/exit.c: TASKS_RCU(tasks_rcu_i = __srcu_read_lock(&tasks_rcu_exit_srcu));
> kernel/exit.c: TASKS_RCU(preempt_enable());
> kernel/exit.c: TASKS_RCU(__srcu_read_unlock(&tasks_rcu_exit_srcu, tasks_rcu_i));
>
> I.e. I question that this should be a generic RCU facility.

Simpler would be better!

However, is it really guaranteed that one SCHED_IDLE thread cannot
preempt another? If not, then the trampoline-freeing SCHED_IDLE thread
might preempt some other SCHED_IDLE thread in the middle of a trampoline.
I am not seeing anything that prevents such preemption, but it is rather
early local time, so I could easily be missing something.

However, if SCHED_IDLE threads cannot preempt other threads, even other
SCHED_IDLE threads, then your approach sounds quite promising to me.

Steve, Peter, thoughts?

Thanx, Paul