Re: [PATCH 18/18] sched: Swap tasks when rescheduling if a CPU on a target node is imbalanced

From: Mel Gorman
Date: Thu Aug 01 2013 - 11:49:10 EST


On Thu, Aug 01, 2013 at 10:29:58AM +0530, Srikar Dronamraju wrote:
> > @@ -904,6 +908,8 @@ static int task_numa_find_cpu(struct task_struct *p, int nid)
> >  	src_eff_load *= src_load + effective_load(tg, src_cpu, -weight, -weight);
> >
> >  	for_each_cpu(cpu, cpumask_of_node(nid)) {
> > +		struct task_struct *swap_candidate = NULL;
> > +
> >  		dst_load = target_load(cpu, idx);
> >
> >  		/* If the CPU is idle, use it */
> > @@ -922,12 +928,41 @@ static int task_numa_find_cpu(struct task_struct *p, int nid)
> >  		 * migrate to its preferred node due to load imbalances.
> >  		 */
> >  		balanced = (dst_eff_load <= src_eff_load);
> > -		if (!balanced)
> > -			continue;
> > +		if (!balanced) {
> > +			struct rq *rq = cpu_rq(cpu);
> > +			unsigned long src_faults, dst_faults;
> > +
> > +			/* Do not move tasks off their preferred node */
> > +			if (rq->curr->numa_preferred_nid == nid)
> > +				continue;
> > +
> > +			/* Do not attempt an illegal migration */
> > +			if (!cpumask_test_cpu(cpu, tsk_cpus_allowed(rq->curr)))
> > +				continue;
> > +
> > +			/*
> > +			 * Do not impair locality for the swap candidate.
> > +			 * Destination for the swap candidate is the source cpu
> > +			 */
> > +			if (rq->curr->numa_faults) {
> > +				src_faults = rq->curr->numa_faults[task_faults_idx(nid, 1)];
> > +				dst_faults = rq->curr->numa_faults[task_faults_idx(src_cpu_node, 1)];
> > +				if (src_faults > dst_faults)
> > +					continue;
> > +			}
> > +
> > +			/*
> > +			 * The destination is overloaded but running a task
> > +			 * that is not running on its preferred node. Consider
> > +			 * swapping the CPU tasks are running on.
> > +			 */
> > +			swap_candidate = rq->curr;
> > +		}
> >
> >  		if (dst_load < min_load) {
> >  			min_load = dst_load;
> >  			dst_cpu = cpu;
> > +			*swap_p = swap_candidate;
>
> Are we sometimes passing a wrong candidate?
> Let's say balanced is false for the first CPU and we set swap_candidate,
> but the second CPU (or a later one) turns out to be idle or to have a lower
> effective load. Then we could be sending the task that is running on the
> first CPU as the swap candidate.

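Then at the second or later CPU swap_candidate == NULL, because it is
initialised to NULL at the start of every loop iteration. When dst_cpu is
updated for that CPU, *swap_p is assigned that NULL, so swap_p is cleared too.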
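
To make the point concrete, here is a rough standalone model of the loop,
not the kernel code itself. struct cpu_model, find_cpu_model() and the task
names are hypothetical stand-ins I made up for illustration. The model skips
the preferred-node, cpus_allowed and fault checks as well as the idle-CPU
shortcut, and keeps only the two things that matter here: swap_candidate is
declared inside the loop body, and *swap_p is written whenever dst_cpu is.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for per-CPU state; this is not the kernel's struct rq. */
struct cpu_model {
	unsigned long load;	/* stand-in for target_load(cpu, idx) */
	int balanced;		/* stand-in for dst_eff_load <= src_eff_load */
	const char *curr;	/* name of the task currently running there */
};

/*
 * Return the index of the least-loaded CPU, or -1 if there are none.
 * *swap_p ends up holding the candidate that belongs to the chosen CPU,
 * or NULL if the chosen CPU was balanced.
 */
static int find_cpu_model(const struct cpu_model *cpus, int n,
			  const char **swap_p)
{
	unsigned long min_load = ULONG_MAX;
	int dst_cpu = -1;

	assert(swap_p != NULL);
	*swap_p = NULL;

	for (int cpu = 0; cpu < n; cpu++) {
		/* Declared in the loop body, so it is NULL again for every CPU. */
		const char *swap_candidate = NULL;

		if (!cpus[cpu].balanced)
			swap_candidate = cpus[cpu].curr;

		if (cpus[cpu].load < min_load) {
			min_load = cpus[cpu].load;
			dst_cpu = cpu;
			/* Overwrites whatever an earlier CPU stored here. */
			*swap_p = swap_candidate;
		}
	}

	return dst_cpu;
}
```

With CPU 0 imbalanced at load 100 and CPU 1 balanced at load 50, the model
picks CPU 1 and leaves *swap_p as NULL, so the task from CPU 0 is not passed
on as the swap candidate.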

--
Mel Gorman
SUSE Labs