Hello,
sorry it took me so long to get back to this. I prepared the next version with
your comments applied and Tony's replies taken into account.
I wanted to briefly discuss this before posting:
On 2024-05-30 at 16:07:29 -0700, Reinette Chatre wrote:
On 5/15/24 4:18 AM, Maciej Wieczor-Retman wrote:
+ return 1;
+ }
+
+ for (i = 1; i <= MAX_SNC ; i++) {
+ if (i * node_cpus >= cache_cpus)
+ return i;
+ }
This is not obvious to me. From the function comments this seems to address the
scenarios when CPUs from other nodes are offline. It is not clear to me how
this loop addresses this. For example, let's say there are four SNC nodes
associated with a cache and only the node0 CPUs are online. The above would
detect this as "1", not "4", if I read this right?
I wonder if it may not be easier to just follow what the kernel does
(in the new version).
User space can learn the number of online and present CPUs from
/sys/devices/system/cpu/online and /sys/devices/system/cpu/present
respectively. A simple string compare of the contents can be used to
determine if they are identical and a warning can be printed if they are not.
With a warning when accurate detection cannot be done the simple
check will do.
Could you please add an informational message indicating how many SNC nodes
were indeed detected?
Should the information "how many SNC nodes are detected?" get printed every time
(by which I mean at the end of CMT and MBM tests) or only when we get the error
"SNC enabled but kernel doesn't support it" happens? Of course in the first case
if there is only 1 node detected nothing would be printed to avoid noise.