Re: [PATCH RFC/TEST] sched: make sync affine wakeups work

From: Peter Zijlstra
Date: Tue May 06 2014 - 16:41:28 EST


On Tue, May 06, 2014 at 04:20:59PM -0400, Rik van Riel wrote:
> On 05/06/2014 09:25 AM, Peter Zijlstra wrote:
> > On Sun, May 04, 2014 at 08:41:09AM -0400, Rik van Riel wrote:
> >> Even on 8-node DL980 systems, the NUMA distance in the
> >> SLIT table is less than RECLAIM_DISTANCE, and we will
> >> do wake_affine across the entire system.
> >
> > Yeah, so the problem is that (AFAIK) ACPI doesn't actually specify a
> > metric for the SLIT distance. This (in as far as BIOS people would care
> > to stick to specs anyhow) has led to the 'fun' situation where BIOS
> > engineers tweak SLIT table values to make OSes behave as they think it
> > should.
> >
> > So if the BIOS engineer finds that this system should have <
> > RECLAIM_DISTANCE it will simply make the table such that the max SLIT
> > value is below that.
> >
> > And yes, I've seen this :-(
>
> It appears to be the case on the vast majority of the NUMA
> systems that are actually in use.
>
> To me, this suggests that we should probably deal with it.

What we could do is redefine this distance in hops; that'll force them
to lie more blatantly and actually misrepresent the topology.
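
A minimal sketch of what a hop-style distance could look like, assuming
the SLIT is available as a row-major n x n byte table. The helper
slit_hops() and the 4-node example_slit table are hypothetical, made up
for illustration; this is not existing kernel code. It approximates
"hops" as the rank of a distance among the distinct values in the
table, which is not the same as counting real interconnect hops:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical 4-node SLIT, row-major. 10 is the local distance; the
 * values 16 and 22 are invented for this example.
 */
const unsigned char example_slit[16] = {
	10, 16, 16, 22,
	16, 10, 22, 16,
	16, 22, 10, 16,
	22, 16, 16, 10,
};

/*
 * Approximate the hop count between two nodes as the number of
 * distinct SLIT values that are strictly smaller than the distance
 * between them. The smallest value in the table (normally the local
 * distance) therefore maps to 0, the next larger value to 1, and so on.
 */
int slit_hops(const unsigned char *slit, size_t n, size_t from, size_t to)
{
	unsigned char seen[256] = { 0 };
	unsigned char d;
	int hops = 0;
	size_t i;

	assert(slit != NULL && from < n && to < n);

	d = slit[from * n + to];
	for (i = 0; i < n * n; i++) {
		unsigned char v = slit[i];

		/* Count each distinct smaller distance exactly once. */
		if (v < d && !seen[v]) {
			seen[v] = 1;
			hops++;
		}
	}
	return hops;
}
```

With a ranking like this, scaling the raw SLIT values up or down changes
nothing; the only way to alter the result is to merge or split distance
levels, i.e. to change what the table says about the topology itself.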