Re: [patch 1/2] nohz: only wakeup a single target cpu when kicking a task

From: Frederic Weisbecker
Date: Thu Oct 08 2020 - 15:54:49 EST


On Thu, Oct 08, 2020 at 02:54:09PM -0300, Marcelo Tosatti wrote:
> On Thu, Oct 08, 2020 at 02:22:56PM +0200, Peter Zijlstra wrote:
> > On Wed, Oct 07, 2020 at 03:01:52PM -0300, Marcelo Tosatti wrote:
> > > When adding a tick dependency to a task, its necessary to
> > > wakeup the CPU where the task resides to reevaluate tick
> > > dependencies on that CPU.
> > >
> > > However the current code wakes up all nohz_full CPUs, which
> > > is unnecessary.
> > >
> > > Switch to waking up a single CPU, by using ordering of writes
> > > to task->cpu and task->tick_dep_mask.
> > >
> > > From: Frederic Weisbecker <frederic@xxxxxxxxxx>
> > > Suggested-by: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> > > Signed-off-by: Frederic Weisbecker <frederic@xxxxxxxxxx>
> > > Signed-off-by: Marcelo Tosatti <mtosatti@xxxxxxxxxx>
> > >
> > > Index: linux-2.6/kernel/time/tick-sched.c
> > > ===================================================================
> > > --- linux-2.6.orig/kernel/time/tick-sched.c
> > > +++ linux-2.6/kernel/time/tick-sched.c
> > > @@ -274,6 +274,31 @@ void tick_nohz_full_kick_cpu(int cpu)
> > > irq_work_queue_on(&per_cpu(nohz_full_kick_work, cpu), cpu);
> > > }
> > >
> > > +static void tick_nohz_kick_task(struct task_struct *tsk)
> > > +{
> > > + int cpu = task_cpu(tsk);
> > > +
> > > + /*
> > > + * If the task concurrently migrates to another cpu,
> > > + * we guarantee it sees the new tick dependency upon
> > > + * schedule.
> > > + *
> > > + *
> > > + * set_task_cpu(p, cpu);
> > > + * STORE p->cpu = @cpu
> > > + * __schedule() (switch to task 'p')
> > > + * LOCK rq->lock
> > > + * smp_mb__after_spin_lock() STORE p->tick_dep_mask
> > > + * tick_nohz_task_switch() smp_mb() (atomic_fetch_or())
> > > + * LOAD p->tick_dep_mask LOAD p->cpu
> > > + */
> > > +
> > > + preempt_disable();
> > > + if (cpu_online(cpu))
> > > + tick_nohz_full_kick_cpu(cpu);
> > > + preempt_enable();
> > > +}
> >
> > So we need to kick the CPU unconditionally, or only when the task is
> > actually running? AFAICT we only care about current->tick_dep_mask.
>
> tick is necessary to execute run_posix_cpu_timers, from tick interrupt,
> even if task is not running.

Yes but if the task isn't running, run_posix_cpu_timers() doesn't have
anything to elapse. So indeed we can spare the IPI if the task is not
running. Provided ordering makes sure that the task sees the new dependency
when it schedules in of course.