Re: [PATCH v2 1/8] x86/CPU/AMD: Save NodeId on AMD-based systems

From: Yazen Ghannam
Date: Mon Sep 14 2020 - 15:20:58 EST


On Thu, Sep 10, 2020 at 12:14:43PM +0200, Borislav Petkov wrote:
> On Wed, Sep 09, 2020 at 03:17:55PM -0500, Yazen Ghannam wrote:
> > We need to access specific instances of hardware registers in the
> > Northbridge or Data Fabric. The code in arch/x86/kernel/amd_nb.c does
> > this.
>
> So you don't need the node_id - you need the northbridge/data fabric ID?
> I'm guessing NB == DF, i.e., it was NB before Zen and it is DF now.
>
> Yes?
>

Yes, that's right.

I called it "node_id" based on the AMD documentation and what it's
called today in the Linux code. It's called other things like nb_id and
nid too.

I think we can call it something else to avoid confusion with NUMA nodes
if that'll help.

> > Package = Socket, i.e. a field replaceable unit. Socket may not be
> > useful for software, but I think it helps users identify the hardware.
> >
> > I think the following could be changed in the documentation:
> >
> > "In the past a socket always contained a single package (see below), but
> > with the advent of Multi Chip Modules (MCM) a socket can hold more than one
> > package."
> >
> > Replace "package" with "die".
>
> So first of all, we have:
>
> "AMD nomenclature for package is 'Node'."
>
> so we either change that because as you explain, node != package on AMD.
>
> What you need is the ID of that northbridge or data fabric instance,
> AFAIU.
>
> > You take multiple dies from the foundry and you "package" them together
> > into a single unit.
>
> I think you're overloading the word "package" here and that leads to
> more confusion. Package in our definition - Linux' - is:
>
> "Packages contain a number of cores plus shared resources, e.g. DRAM
> controller, shared caches etc." If you glue several packages together,
> you get an MCM.
>

Yes, you're right. The AMD documentation is different, so I'll try to
stick with the Linux documentation and qualify names with "AMD" when
noting the usage by the AMD docs.

> > They could be equal depending on the system. The values are different on
> > MCM systems like Bulldozer and Naples though.
> >
> > The functions and structures in amd_nb.c are indexed by the node_id.
> > This is done implicitly right now by using amd_get_nb_id()/cpu_llc_id.
> > But the LLC isn't always equal to the Node/Die like in Naples. So the
> > patches in this set save and explicitly use the node_id when needed.
> >
> > What do you think?
>
> Sounds to me that you want to ID that data fabric instance which
> logically belongs to one or multiple packages. Or can a DF a single
> package?
>
> So let's start simple: how does a DF instance map to a logical NUMA
> node or package? Can a DF serve multiple packages?
>

There's one DF/NB per package and it's a fixed value, i.e. it shouldn't
change based on the NUMA configuration.

Here's an example of a 2 socket Naples system with 4 packages per socket
and setup to have 1 NUMA node. The "node_id" value is the AMD NodeId
from CPUID.

CPU=0 phys_proc_id=0 node_id=0 cpu_to_node()=0
CPU=8 phys_proc_id=0 node_id=1 cpu_to_node()=0
CPU=16 phys_proc_id=0 node_id=2 cpu_to_node()=0
CPU=24 phys_proc_id=0 node_id=3 cpu_to_node()=0
CPU=32 phys_proc_id=1 node_id=4 cpu_to_node()=0
CPU=40 phys_proc_id=1 node_id=5 cpu_to_node()=0
CPU=48 phys_proc_id=1 node_id=6 cpu_to_node()=0
CPU=56 phys_proc_id=1 node_id=7 cpu_to_node()=0

> You could use the examples at the end of Documentation/x86/topology.rst
> to explain how those things play together. And remember to not think
> about the physical aspect of the hardware structure because it doesn't
> mean anything to software. All you wanna do is address the proper DF
> instance so this needs to be enumerable and properly represented by sw.
>

Yeah, I think example 4b works here. The mismatch though is with
phys_proc_id and package on AMD systems. You can see above that
phys_proc_id gives a socket number, and the AMD NodeId gives a package
number.

Should we add a note under cpuinfo_x86.phys_proc_id to make this
distinction?

> Confused?
>
> I am.
>
> :-)
>

Yeah, me too. :)

Thanks,
Yazen