Re: [PATCH v6 1/5] x86/amd_nb: Add support for northbridges on Aldebaran
From: Borislav Petkov
Date: Tue Nov 09 2021 - 15:41:15 EST
On Tue, Nov 09, 2021 at 05:00:11PM +0530, Chatradhi, Naveen Krishna wrote:
> I was trying to handle both cpu and cpu northbridge enumeration in the
> amd_cache_northbridges() itself by reusing the existing structures and APIs.
>
> Should have seen this through more clearly. As, this is working well for the
> following reasons.
Good, that's exactly what I meant! :)
> a. Allocating the amd_northbridges.nb after identifying both the cpu and gpu
> misc devices, would extend node_to_amd_nb(node) for both cpu and gpu nodes.
Well, there's a reason those things are functions - so that you can do
the necessary computation inside them and when stuff needs to change,
users don't have to. That's why amd_northbridges is static and only the
functions are exported.
So if you want to make sure node_to_amd_nb() works for GPU nodes too,
you simply have to look at the right "container" so to speak, depending
on the number @node passed in as an argument and look it up in the
proper array of pointers:
[ CPU_NB0, CPU_NB1, ..., CPU_NB-N] [ GPU_NB0, ... ]
you get the idea.
> It is used extensively in this module. However, the roots_per_misc value
> is different in case of cpus and gpus and that needed to be handled
> seperately.
>
> b. amd_nb_num(void) is used by other modules in the kernel, returning the
> total count of CPU and GPU northbridges would break the existing code.
amd64_edac is using it too, to know how many driver instances to allocate.
The other users - amd_gart_64.c and amd64-agp.c - are old stuff so they
most certainly don't need to get the number of GPU nodes too.
> I understood your point now.
Good.
> When we create separate functions for caching cpu and gpu devices, is it
> okay to create "struct amd_gpu_nb_info" with the following fields
>
> a. gpu_num;
> b. struct amd_northbridge *gpu_nb;
> c. gpu_node_start_id;
Makes sense. You need to put in it anything that describes the GPU NBs
on the system and that other code would use.
> While, amd_nb_num(), continues to return number of cpu NBs
> Add new API amd_gpu_nb_num(), return number of gpu NBs
>
> and modify the node_to_amd_nb(node) to extend the same behavior for gpu
> devices also
Yap, exactly.
And since amd64_edac is going to need the full NB count, you can use
there both:
num_nodes = amd_nb_num() + amd_gpu_nb_num();
...
Easy, peasy. :-)
Thanks!
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette