Re: [patch 1/2] nohz: only wakeup a single target cpu when kicking a task

From: Marcelo Tosatti
Date: Thu Oct 08 2020 - 14:05:28 EST


On Thu, Oct 08, 2020 at 02:22:56PM +0200, Peter Zijlstra wrote:
> On Wed, Oct 07, 2020 at 03:01:52PM -0300, Marcelo Tosatti wrote:
> > When adding a tick dependency to a task, it's necessary to
> > wake up the CPU where the task resides to reevaluate tick
> > dependencies on that CPU.
> >
> > However, the current code wakes up all nohz_full CPUs, which
> > is unnecessary.
> >
> > Switch to waking up a single CPU, by using ordering of writes
> > to task->cpu and task->tick_dep_mask.
> >
> > From: Frederic Weisbecker <frederic@xxxxxxxxxx>
> > Suggested-by: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> > Signed-off-by: Frederic Weisbecker <frederic@xxxxxxxxxx>
> > Signed-off-by: Marcelo Tosatti <mtosatti@xxxxxxxxxx>
> >
> > Index: linux-2.6/kernel/time/tick-sched.c
> > ===================================================================
> > --- linux-2.6.orig/kernel/time/tick-sched.c
> > +++ linux-2.6/kernel/time/tick-sched.c
> > @@ -274,6 +274,31 @@ void tick_nohz_full_kick_cpu(int cpu)
> >  	irq_work_queue_on(&per_cpu(nohz_full_kick_work, cpu), cpu);
> >  }
> >
> > +static void tick_nohz_kick_task(struct task_struct *tsk)
> > +{
> > +	int cpu = task_cpu(tsk);
> > +
> > +	/*
> > +	 * If the task concurrently migrates to another cpu,
> > +	 * we guarantee it sees the new tick dependency upon
> > +	 * schedule.
> > +	 *
> > +	 *
> > +	 * set_task_cpu(p, cpu);
> > +	 *   STORE p->cpu = @cpu
> > +	 * __schedule() (switch to task 'p')
> > +	 *   LOCK rq->lock
> > +	 *   smp_mb__after_spin_lock()          STORE p->tick_dep_mask
> > +	 *   tick_nohz_task_switch()            smp_mb() (atomic_fetch_or())
> > +	 *      LOAD p->tick_dep_mask           LOAD p->cpu
> > +	 */
> > +
> > +	preempt_disable();
> > +	if (cpu_online(cpu))
> > +		tick_nohz_full_kick_cpu(cpu);
> > +	preempt_enable();
> > +}
>
> So we need to kick the CPU unconditionally, or only when the task is
> actually running? AFAICT we only care about current->tick_dep_mask.

The tick is necessary to execute run_posix_cpu_timers() from the tick
interrupt, even if the task is not running.
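
For reference, the ordering argument in the quoted comment is the usual
store-buffering pattern: each side stores its own variable, executes a
full barrier, then loads the other side's variable, so at least one side
must observe the other's store. Below is a minimal userspace sketch of
that pattern with C11 atomics and pthreads. It is not kernel code:
task_cpu_var, dep_mask, sched_side(), kick_side() and
count_both_missed() are made-up names, and the seq_cst fences are
assumed stand-ins for smp_mb__after_spin_lock() and for the full
ordering implied by the kernel's atomic_fetch_or().

```c
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical stand-ins for p->cpu and p->tick_dep_mask. */
static atomic_int task_cpu_var;
static atomic_int dep_mask;

/* Values each side loaded; read by the caller only after pthread_join(). */
static int seen_mask;
static int seen_cpu;

/* Scheduler side: STORE p->cpu, full barrier, LOAD p->tick_dep_mask. */
static void *sched_side(void *arg)
{
	(void)arg;
	atomic_store_explicit(&task_cpu_var, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	seen_mask = atomic_load_explicit(&dep_mask, memory_order_relaxed);
	return NULL;
}

/* Kicker side: set a dependency bit, full barrier, LOAD p->cpu. */
static void *kick_side(void *arg)
{
	(void)arg;
	atomic_fetch_or_explicit(&dep_mask, 1, memory_order_relaxed);
	/* Explicit fence models the kernel's fully ordered atomic_fetch_or(). */
	atomic_thread_fence(memory_order_seq_cst);
	seen_cpu = atomic_load_explicit(&task_cpu_var, memory_order_relaxed);
	return NULL;
}

/*
 * Run the two sides concurrently 'trials' times and count the runs in
 * which BOTH sides missed the other's store. Returns -1 if a thread
 * could not be created.
 */
static int count_both_missed(int trials)
{
	int missed = 0;

	for (int i = 0; i < trials; i++) {
		pthread_t a, b;

		atomic_store(&task_cpu_var, 0);
		atomic_store(&dep_mask, 0);

		if (pthread_create(&a, NULL, sched_side, NULL))
			return -1;
		if (pthread_create(&b, NULL, kick_side, NULL)) {
			pthread_join(a, NULL);
			return -1;
		}
		pthread_join(a, NULL);
		pthread_join(b, NULL);

		if (seen_mask == 0 && seen_cpu == 0)
			missed++;
	}
	return missed;
}
```

Calling count_both_missed() from a small main() (build with -pthread)
should never report a missed run while both fences are in place. In
terms of the patch: either the scheduling side sees the new
tick_dep_mask bit, or the kicker sees the new CPU and sends the kick
there.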