Re: [RFC][PATCH 0/8] sched,idle: need resched polling rework

From: Peter Zijlstra
Date: Tue Jun 03 2014 - 12:20:12 EST


On Tue, Jun 03, 2014 at 09:05:03AM -0700, Andy Lutomirski wrote:
> On Tue, Jun 3, 2014 at 7:02 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > On Tue, Jun 03, 2014 at 12:43:47PM +0200, Peter Zijlstra wrote:
> >> We need rq->curr, rq->idle 'sleeps' with polling set and nr clear, but
> >> it obviously has no effect setting that if its not actually the current
> >> task.
> >>
> >> Touching rq->curr needs holding rcu_read_lock() though, to make sure the
> >> task stays around, still shouldn't be a problem.
> >
> >> @@ -1581,8 +1604,14 @@ void scheduler_ipi(void)
> >>
> >> static void ttwu_queue_remote(struct task_struct *p, int cpu)
> >> {
> >> - if (llist_add(&p->wake_entry, &cpu_rq(cpu)->wake_list))
> >> - smp_send_reschedule(cpu);
> >> + struct rq *rq = cpu_rq(cpu);
> >> +
> >> + if (llist_add(&p->wake_entry, &rq->wake_list)) {
> >> + rcu_read_lock();
> >> + if (!set_nr_if_polling(rq->curr))
> >> + smp_send_reschedule(cpu);
> >> + rcu_read_unlock();
> >> + }
> >> }
> >
> > Hrmm, I think that is still broken, see how in schedule() we clear NR
> > before setting the new ->curr.
> >
> > So I think I had a loop on rq->curr the last time we talked about this,
> > but alternatively we could look at clearing NR after setting a new curr.
> >
> > I think I once looked at why it was done before, of course I can't
> > actually remember the details :/
>
> Wouldn't this be a little simpler and maybe even faster if we just
> changed the idle loop to make TIF_POLLING_NRFLAG be a real indication
> that the idle task is running and actively polling? That is, change
> the end of cpuidle_idle_loop to:
>
> preempt_set_need_resched();
> tick_nohz_idle_exit();
> clear_tsk_need_resched(current);
> __current_clr_polling();
> smp_mb__after_clear_bit();
> WARN_ON_ONCE(test_thread_flag(TIF_POLLING_NRFLAG));
> sched_ttwu_pending();
> schedule_preempt_disabled();
> __current_set_polling();
>
> This has the added benefit that the optimistic version of the cmpxchg
> loop would be safe again. I'm about to test this with this variant.
> I'll try and send a comprehensible set of patches in a few hours.
>
> Can you remind me what the benefit was of letting polling be set when
> the idle thread schedules?

Hysterical raisins, I don't think there's an actual reason, so yes, that
might be the best option indeed.

> It seems racy to me: it probably prevents
> any safe use of the polling bit without holding the rq lock. I guess
> there's some benefit to having polling be set for as long as possible,
> but it only helps if there are wakeups in very rapid succession, and
> it costs a couple of extra bit ops per idle entry.

So you could cheat and set it in pick_next_task_idle() and clear in
put_prev_task_idle(), that way the entire idle loop, when running has it
set.

And then there was the idle injection loop crap trainwreck, which I
should send patches for..

Attachment: pgp7AAGeF248Q.pgp
Description: PGP signature