Re: [PATCH 04/19] sched/numa: Set preferred_node based on best_cpu

From: Peter Zijlstra
Date: Mon Jun 04 2018 - 09:40:04 EST


On Mon, Jun 04, 2018 at 05:59:40AM -0700, Srikar Dronamraju wrote:
> * Peter Zijlstra <peterz@xxxxxxxxxxxxx> [2018-06-04 14:23:36]:
>
> > OK, the above matches the description, but I'm puzzled by the remainder:
> >
> > >
> > > -		if (ng->active_nodes > 1 && numa_is_active_node(env.dst_nid, ng))
> > > -			sched_setnuma(p, env.dst_nid);
> > > +		if (nid != p->numa_preferred_nid)
> > > +			sched_setnuma(p, nid);
> > >  	}
> >
> > That seems to entirely lose the active_node thing, or are you saying
> > best_cpu already includes that? (Changelog could use a little help there
> > I suppose)
>
> I think checking for active_nodes before calling sched_setnuma was a
> mistake.
>
> Before this change, we may retain the source node as numa_preferred_nid
> even though we select another node with better numa affinity to run on.
> So we force a thread to run on a node which is not its preferred_node.
> In the course of regular load balancing, this task might then be moved
> back to the node set as preferred_node, which is actually not the node
> it prefers.

Then your Changelog had better explain all that, no?
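
For reference, the behaviour change discussed above can be modelled roughly
as follows. This is a simplified user-space sketch, not kernel code: struct
toy_task, struct toy_group, the node_active[] array, and both helper
functions are hypothetical names invented for illustration. The real
numa_is_active_node() works from fault statistics, which the sketch
replaces with a plain flag per node, and a direct assignment stands in for
sched_setnuma().

```c
#include <stdbool.h>

#define TOY_NR_NODES 4	/* assumed machine size, for the sketch only */

/* Hypothetical stand-in for the per-task NUMA state. */
struct toy_task {
	int numa_preferred_nid;	/* node that load balancing pulls the task toward */
};

/* Hypothetical stand-in for the numa_group state used by the old check. */
struct toy_group {
	int active_nodes;			/* number of nodes the group actively uses */
	bool node_active[TOY_NR_NODES];		/* replaces numa_is_active_node() */
};

/*
 * Old condition from the quoted diff: the preferred node is updated only
 * when the group spans more than one active node and the destination is
 * one of them. Otherwise the old preferred node is silently retained.
 */
static void update_preferred_old(struct toy_task *p,
				 const struct toy_group *ng, int dst_nid)
{
	if (ng->active_nodes > 1 && ng->node_active[dst_nid])
		p->numa_preferred_nid = dst_nid;
}

/*
 * New condition from the quoted diff: whenever the selected node differs
 * from the current preferred node, the preferred node follows it.
 */
static void update_preferred_new(struct toy_task *p, int nid)
{
	if (nid != p->numa_preferred_nid)
		p->numa_preferred_nid = nid;
}
```

With a group that has a single active node (node 0) and a task being placed
on node 1, the old helper leaves numa_preferred_nid at 0 while the task runs
on node 1, which is the mismatch described in the quoted explanation. The
new helper moves numa_preferred_nid to 1 in the same situation.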