Re: [PATCH 6/6] sched/numa: Delay retrying placement for automatic NUMA balance after wake_affine

From: Mel Gorman
Date: Tue Feb 13 2018 - 10:00:27 EST


On Tue, Feb 13, 2018 at 03:43:26PM +0100, Peter Zijlstra wrote:
> >
> > Well, it was deliberate. While it's possible to be on the same memory
> > node and not sharing cache, the scheduler typically is more concerned with
> > the LLC than NUMA per-se. If they share LLC, then I also assume that they
> > share memory locality.
>
> True, but the remaining code only has effect for numa balance, which is
> concerned with nodes. So I don't see the point of using something
> potentially smaller.
>
> Suppose someone did hardware where a node has 2 cache clusters, then
> we'd still set a wake_affine back-off for numa-balance, even though it
> remains on the same node.
>
> How would that be useful?

Fair point, it could be unexpected from a NUMA balancing perspective, and
sub-NUMA clustering does exist, so it's a potential issue. I'm happy to
change it to cpu_to_node. I can resend the series if you prefer, but feel
free to change it in-place if you're picking it up. I do not expect any
change in behaviour on the machines I tested, as on all of them checking
the LLC was equivalent to checking the node ID.
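
To make the difference concrete, here is a minimal userspace sketch of the
two checks. It is not the patch itself. cpus_share_cache() and cpu_to_node()
are the kernel helpers being discussed, but everything prefixed model_ below,
along with the 8-CPU topology (2 nodes, 2 LLC clusters per node), is a
hypothetical stand-in used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 8

/*
 * Hypothetical topology with sub-NUMA style clustering:
 * CPUs 0-3 sit on node 0, CPUs 4-7 on node 1, and each node
 * is split into two LLC clusters of two CPUs.
 */
static const int model_cpu_node[MODEL_NR_CPUS] = { 0, 0, 0, 0, 1, 1, 1, 1 };
static const int model_cpu_llc[MODEL_NR_CPUS]  = { 0, 0, 1, 1, 2, 2, 3, 3 };

/* Stand-in for cpu_to_node(): node ID of a CPU in the model. */
int model_cpu_to_node(int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	return model_cpu_node[cpu];
}

/* Stand-in for cpus_share_cache(): true if both CPUs share an LLC. */
bool model_cpus_share_cache(int a, int b)
{
	assert(a >= 0 && a < MODEL_NR_CPUS);
	assert(b >= 0 && b < MODEL_NR_CPUS);
	return model_cpu_llc[a] == model_cpu_llc[b];
}

/* LLC-based check: back off whenever the wakeup leaves the LLC. */
bool model_backoff_llc(int prev_cpu, int target)
{
	return !model_cpus_share_cache(prev_cpu, target);
}

/* Node-based check: back off only when the wakeup leaves the node. */
bool model_backoff_node(int prev_cpu, int target)
{
	return model_cpu_to_node(prev_cpu) != model_cpu_to_node(target);
}
```

In this model, a wakeup that moves a task from CPU 0 to CPU 2 stays on node 0
but crosses an LLC boundary, so the LLC-based check would set the back-off
while the node-based check would not. On a topology with one LLC per node the
two checks agree for every CPU pair.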

--
Mel Gorman
SUSE Labs