Re: [PATCH v2] sched/topology: Check average distances to remote packages

From: Tim Chen

Date: Thu Feb 26 2026 - 13:26:10 EST


On Wed, 2026-02-25 at 17:29 -0600, Kyle Meyer wrote:
> On Wed, Feb 25, 2026 at 02:55:58PM -0800, Tim Chen wrote:
> > On Wed, 2026-02-25 at 23:30 +0100, Peter Zijlstra wrote:
> > > On Wed, Feb 25, 2026 at 01:37:11PM -0800, Tim Chen wrote:
> > > > On Wed, 2026-02-25 at 17:32 +0100, Peter Zijlstra wrote:
> > > > > On Wed, Feb 25, 2026 at 04:44:09PM +0100, Peter Zijlstra wrote:
> > > > >
> > > > > > Yes, so this assumes that all u sized clusters on the trace are similar
> > > > > > and 'sane' without verification.
> > > > >
> > > > > That gave me an idea; how's this then?
> > > >
> > > > Sorry I was sick for a few days. Just catching up on this
> > > > thread here. I think your patch takes care of both GNR SNC-3 
> > > > with 3 compute dies (with non-symmetric remote
> > > > distances) and generic SNC-2 with 2 dies (symmetric
> > > > distances) very well.
> > > >
> > > > Minor suggestion below for the patch.
> > > >
> > > > Will ask the original GNR teams with the problem to try
> > > > it out.
> > >
> > > Since HPE can obviously have a sane SLIT table; why can't we simply
> > > claim the SLIT table they had is broken and needs fixing?
> >
> > From what I can see HPE seems to use SNC-2 variant of GNR so the SLIT
> > is symmetric.
>
> Yes, and the SKUs that don't support SNC.
>
> The SKUs that support SNC-3 are limited to 2 packages.

Yes, I think there are only 2-package SNC-3 systems out there.

>
> > Unfortunately, in the topology for the 2-socket GNR that has 3 dies, there
> > are truly asymmetric paths from die A to die B between remote
> > sockets, from what I'm told.
>
> What does MLC look like?

I don't have access to one for measurements. Will have to ask colleagues
to measure that.

>
> > > Also, is there really no enumeration of the SNC mode available; must we
> > > really divine it?
> >
> > Let me dig into that a bit. I was also thinking with that information,
> > it will make the code a lot simpler.

There are truly no hardware bits that expose the SNC mode. We will have to
rely on snc_get_config() and use the ratio of the number of CPUs in a node vs.
the number of CPUs sharing an L3 to get an estimate.
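
Roughly, the ratio estimate could look like the sketch below. This is
illustration only, not the kernel code: estimate_snc_nodes() and its
parameters are hypothetical names, and the two CPU counts are assumed to
have been obtained elsewhere (e.g. from the L3 shared-CPU map and the
node's cpumask). Anything that does not divide evenly is treated as
"SNC off" here, which is an assumption on my part.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not a kernel function): estimate how many NUMA
 * nodes share one L3 cache, i.e. the SNC mode, from two CPU counts.
 *
 *   cpus_per_l3   - number of CPUs sharing one L3 cache
 *   cpus_per_node - number of CPUs in one NUMA node
 *
 * Returns 1 when SNC looks disabled or the inputs are inconsistent.
 */
static unsigned int estimate_snc_nodes(unsigned int cpus_per_l3,
				       unsigned int cpus_per_node)
{
	/* Missing data: assume SNC is off. */
	if (cpus_per_l3 == 0 || cpus_per_node == 0)
		return 1;

	/* A node at least as large as the L3 domain means no sub-NUMA split. */
	if (cpus_per_node >= cpus_per_l3)
		return 1;

	/* Assumption: an uneven split is not a valid SNC configuration. */
	if (cpus_per_l3 % cpus_per_node != 0)
		return 1;

	return cpus_per_l3 / cpus_per_node;
}
```

For example, 96 CPUs sharing an L3 with 32 CPUs per node would give an
estimate of 3 (SNC-3), and 96 with 48 per node would give 2 (SNC-2).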

Tim