Re: [RFC][PATCH v2] sched/rt: Use IPI to trigger RT task push migration instead of pulling

From: Peter Zijlstra
Date: Thu Feb 26 2015 - 02:49:27 EST


On Wed, Feb 25, 2015 at 12:50:15PM -0500, Steven Rostedt wrote:
> > Well, the problem with it is one of collisions. So the 'easy' solution I
> > proposed would be something like:
> >
> > int ips_next(struct ipi_pull_struct *ips)
> > {
> > 	int cpu = ips->src_cpu;
> > 	cpu = cpumask_next(cpu, rto_mask);
> > 	if (cpu >= nr_cpu_ids) {
>
> Do we really need to loop? Just start with the first one, and go to the
> end.
>
> > 		cpu = 0;
> > 		ips->flags |= IPS_LOOPED;
> > 		cpu = cpumask_next(cpu, rto_mask);
> > 		if (cpu >= nr_cpu_ids) /* empty mask */
> > 			return cpu;
> > 	}
> > 	if (ips->flags & IPS_LOOPED && cpu >= ips->stop_cpu)
> > 		return nr_cpu_ids;
> > 	return cpu;
> > }

Yes, notice that we don't start iterating at the beginning; this is on
purpose. If we start iterating at the beginning, _every_ cpu will again
pile up on the first one.

By starting at the current cpu, each cpu will start iteration some place
else and hopefully, with a big enough system, different CPUs end up on a
different rto cpu.
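
As a rough illustration only, here is a userspace model of the walk; it
is not kernel code. NR_CPU_IDS, mask_next() and first_target() are
made-up stand-ins (a plain bitmask replaces rto_mask and cpumask_next()).
Unlike the quoted sketch, the wrap-around restarts from -1 so that CPU 0
is not skipped, since cpumask_next() only returns CPUs strictly after
its argument.

```c
#define NR_CPU_IDS 8		/* hypothetical system size */
#define IPS_LOOPED 0x1

/* Stand-in for rto_mask: bit n set means CPU n is RT-overloaded. */
static unsigned int rto_mask;

/* Model of cpumask_next(): first set bit strictly after 'cpu', else NR_CPU_IDS. */
static int mask_next(int cpu, unsigned int mask)
{
	for (cpu++; cpu < NR_CPU_IDS; cpu++)
		if (mask & (1u << cpu))
			return cpu;
	return NR_CPU_IDS;
}

struct ipi_pull_struct {
	int src_cpu;	/* last CPU visited by the walk */
	int stop_cpu;	/* where the walk ends once it has wrapped */
	int flags;
};

static int ips_next(struct ipi_pull_struct *ips)
{
	int cpu = mask_next(ips->src_cpu, rto_mask);

	if (cpu >= NR_CPU_IDS) {
		/* ran off the end: wrap around, starting before CPU 0 */
		ips->flags |= IPS_LOOPED;
		cpu = mask_next(-1, rto_mask);
		if (cpu >= NR_CPU_IDS)	/* empty mask */
			return cpu;
	}
	if ((ips->flags & IPS_LOOPED) && cpu >= ips->stop_cpu)
		return NR_CPU_IDS;
	return cpu;
}

/* Hypothetical helper: first rto CPU visited by a walk started on 'self'. */
static int first_target(int self)
{
	struct ipi_pull_struct ips = {
		.src_cpu = self, .stop_cpu = self, .flags = 0,
	};

	return ips_next(&ips);
}
```

With rto CPUs {1, 4, 6}, walks started on CPUs 0, 2 and 5 first land on
1, 4 and 6 respectively; starting every walk at the beginning would send
all three to CPU 1.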

> >
> >
> > 	struct ipi_pull_struct *ips = __this_cpu_ptr(ips);
> >
> > 	raw_spin_lock(&ips->lock);
> > 	if (ips->flags & IPS_BUSY) {
> > 		/* there is an IPI active; update state */
> > 		ips->dst_prio = current->prio;
> > 		ips->stop_cpu = ips->src_cpu;
> > 		ips->flags &= ~IPS_LOOPED;
>
> I guess the loop is needed for continuing the work, in case the
> scheduling changed?

That too.

> > 	} else {
> > 		/* no IPI active, make one go */
> > 		ips->dst_cpu = smp_processor_id();
> > 		ips->dst_prio = current->prio;
> > 		ips->src_cpu = ips->dst_cpu;
> > 		ips->stop_cpu = ips->dst_cpu;
> > 		ips->flags = IPS_BUSY;
> >
> > 		cpu = ips_next(ips);
> > 		ips->src_cpu = cpu;
> > 		if (cpu < nr_cpu_ids)
> > 			irq_work_queue_on(&ips->work, cpu);
> > 	}
> > 	raw_spin_unlock(&ips->lock);
>
> I'll have to spend some time comprehending this.

:-)
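
In case it helps, the state handling can be modelled in userspace as
below. This is only a sketch under assumptions: ips_request() is a
made-up name, the lock and the irq_work are left out, and the caller
passes in the CPU and priority that the kernel code would read from
smp_processor_id() and current->prio. The point is the restart: a
request arriving while a walk is in flight moves stop_cpu to the walk's
current position and clears IPS_LOOPED, so the walk does another full
circle from where it is.

```c
#define IPS_LOOPED 0x1
#define IPS_BUSY   0x2

struct ipi_pull_struct {
	int dst_cpu;	/* CPU that wants to pull a task */
	int dst_prio;	/* priority that CPU is currently running at */
	int src_cpu;	/* current position of the walk */
	int stop_cpu;	/* where the walk ends once it has wrapped */
	int flags;
};

/*
 * Hypothetical model of the request path.
 * Returns 1 if a new walk was started, 0 if a running one was extended.
 */
static int ips_request(struct ipi_pull_struct *ips, int this_cpu, int prio)
{
	if (ips->flags & IPS_BUSY) {
		/* a walk is active: refresh the priority and restart the circle */
		ips->dst_prio = prio;
		ips->stop_cpu = ips->src_cpu;
		ips->flags &= ~IPS_LOOPED;
		return 0;
	}

	/* no walk active: begin one at our own position */
	ips->dst_cpu = this_cpu;
	ips->dst_prio = prio;
	ips->src_cpu = this_cpu;
	ips->stop_cpu = this_cpu;
	ips->flags = IPS_BUSY;
	return 1;
}
```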

> > Where you would simply start walking the RTO mask from the current
> > position -- it also includes some restart logic, and you'd only take
> > ips->lock when your ipi handler starts and when it needs to migrate to
> > another cpu.
> >
> > This way, on big systems, there's at least some chance different CPUs
> > find different targets to pull from.
>
> OK, makes sense. I can try that.