Re: [PATCH 6/6] sched/numa: Delay retrying placement for automatic NUMA balance after wake_affine

From: Peter Zijlstra
Date: Tue Feb 13 2018 - 10:10:46 EST


On Tue, Feb 13, 2018 at 03:00:20PM +0000, Mel Gorman wrote:
> On Tue, Feb 13, 2018 at 03:43:26PM +0100, Peter Zijlstra wrote:
> > >
> > > Well, it was deliberate. While it's possible to be on the same memory
> > > node and not sharing cache, the scheduler typically is more concerned with
> > > the LLC than NUMA per se. If they share LLC, then I also assume that they
> > > share memory locality.
> >
> > True, but the remaining code only has effect for numa balance, which is
> > concerned with nodes. So I don't see the point of using something
> > potentially smaller.
> >
> > Suppose someone did hardware where a node has 2 cache clusters, then
> > we'd still set a wake_affine back-off for numa-balance, even though it
> > remains on the same node.
> >
> > How would that be useful?
>
> Fair point, it could be unexpected from a NUMA balancing perspective and
> sub-NUMA clustering does exist, so it's a potential issue. I'm happy to
> change it to cpu_to_node. I can resend the series if you prefer but feel
> free to change it in-place if you're picking it up. I do not expect any
> change on the machines I tested with, as on all of them checking the LLC
> was equivalent to checking the node ID.

OK, changed it locally. Thanks!
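
For reference, the case discussed above (one node containing 2 cache
clusters) can be sketched in plain userspace C. This is only an
illustration under stated assumptions, not the patch itself: the topology
tables and every toy_* name are hypothetical, and only cpu_to_node is a
name taken from this thread.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy topology, assumed for illustration only: a single NUMA node with
 * four CPUs split into two last-level-cache (LLC) clusters.
 */
#define TOY_NR_CPUS 4

static const int toy_llc_id[TOY_NR_CPUS]  = { 0, 0, 1, 1 };
static const int toy_node_id[TOY_NR_CPUS] = { 0, 0, 0, 0 };

/* Hypothetical stand-in for an LLC-sharing test between two CPUs. */
static bool toy_cpus_share_cache(int a, int b)
{
	assert(a >= 0 && a < TOY_NR_CPUS);
	assert(b >= 0 && b < TOY_NR_CPUS);
	return toy_llc_id[a] == toy_llc_id[b];
}

/* Hypothetical stand-in for cpu_to_node(). */
static int toy_cpu_to_node(int cpu)
{
	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	return toy_node_id[cpu];
}

/* Back-off decision keyed on the LLC: may fire within a single node. */
static bool toy_backoff_llc(int prev_cpu, int target_cpu)
{
	return !toy_cpus_share_cache(prev_cpu, target_cpu);
}

/* Back-off decision keyed on the node: fires only when the node changes. */
static bool toy_backoff_node(int prev_cpu, int target_cpu)
{
	return toy_cpu_to_node(prev_cpu) != toy_cpu_to_node(target_cpu);
}
```

With this topology, a wakeup that moves a task from CPU 0 to CPU 2 crosses
an LLC boundary but stays on node 0, so toy_backoff_llc(0, 2) is true while
toy_backoff_node(0, 2) is false. On machines where the LLC spans the whole
node the two checks agree.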