On 27.09.2015 [23:59:12 +0530], Raghavendra K T wrote:
Create arrays that map between serial nids and sparse chipids.
Note: My original idea had only two arrays for the chipid <-> nid mapping.
The final code is inspired by drivers/acpi/numa.c, which maps a proximity
node to a logical node (by Takayoshi Kochi <t-kochi@xxxxxxxxxxxxx>), and
thus uses an additional chipid_map nodemask. The mask makes it easy to find
the first unused nid: it is the first unset bit in the mask.
No change in functionality.
Signed-off-by: Raghavendra K T <raghavendra.kt@xxxxxxxxxxxxxxxxxx>
---
arch/powerpc/mm/numa.c | 48 +++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 47 insertions(+), 1 deletion(-)
diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index dd2073b..f015cad 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -63,6 +63,11 @@ static int form1_affinity;
static int distance_ref_points_depth;
static const __be32 *distance_ref_points;
static int distance_lookup_table[MAX_NUMNODES][MAX_DISTANCE_REF_POINTS];
+static nodemask_t chipid_map = NODE_MASK_NONE;
+static int chipid_to_nid_map[MAX_NUMNODES]
+ = { [0 ... MAX_NUMNODES - 1] = NUMA_NO_NODE };
Hrm, conceptually there are *more* chips than nodes, right? So what
guarantees we won't see > MAX_NUMNODES chips?
+static int nid_to_chipid_map[MAX_NUMNODES]
+ = { [0 ... MAX_NUMNODES - 1] = NUMA_NO_NODE };
/*
* Allocate node_to_cpumask_map based on number of available nodes
@@ -133,6 +138,48 @@ static int __init fake_numa_create_new_node(unsigned long end_pfn,
return 0;
}
+int chipid_to_nid(int chipid)
+{
+ if (chipid < 0)
+ return NUMA_NO_NODE;
Do you really want to support negative inputs? Or should they be treated
as bugs/warnings indicating that you got an unexpected input? Or at least
WARN_ON_ONCE?
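To make that concrete, here is a rough userspace sketch of a warn-once bounds check. It is not kernel code: MAX_NUMNODES is set to an arbitrary small value, warn_once_bad_chipid() and warn_count are hypothetical stand-ins for WARN_ON_ONCE(), and the upper-bound check is my addition rather than something in the patch.

```c
#include <stdio.h>

#define MAX_NUMNODES 4		/* assumption: tiny value to keep the sketch short */
#define NUMA_NO_NODE (-1)

/* Every chipid starts out unmapped. */
static int chipid_to_nid_map[MAX_NUMNODES] = {
	NUMA_NO_NODE, NUMA_NO_NODE, NUMA_NO_NODE, NUMA_NO_NODE
};

static int warn_count;		/* number of warnings actually printed */

/*
 * Hypothetical userspace stand-in for WARN_ON_ONCE(): print a message for
 * the first bad input only, but return the condition on every call.
 */
static int warn_once_bad_chipid(int cond, int chipid)
{
	static int warned;

	if (cond && !warned) {
		warned = 1;
		warn_count++;
		fprintf(stderr, "unexpected chipid %d\n", chipid);
	}
	return cond;
}

int chipid_to_nid(int chipid)
{
	/* Reject out-of-range input at both ends before indexing the array. */
	if (warn_once_bad_chipid(chipid < 0 || chipid >= MAX_NUMNODES, chipid))
		return NUMA_NO_NODE;
	return chipid_to_nid_map[chipid];
}
```

The caller still gets NUMA_NO_NODE back, so behaviour is unchanged for valid input, but the first unexpected chipid leaves a trace in the log.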
+ return chipid_to_nid_map[chipid];
+}
+
+int nid_to_chipid(int nid)
+{
+ if (nid < 0)
+ return NUMA_NO_NODE;
+ return nid_to_chipid_map[nid];
+}
+
+static void __map_chipid_to_nid(int chipid, int nid)
+{
+ if (chipid_to_nid_map[chipid] == NUMA_NO_NODE
+ || nid < chipid_to_nid_map[chipid])
+ chipid_to_nid_map[chipid] = nid;
+ if (nid_to_chipid_map[nid] == NUMA_NO_NODE
+ || chipid < nid_to_chipid_map[nid])
+ nid_to_chipid_map[nid] = chipid;
+}
chip <-> node mapping is a static (physical) concept, right? Should we
emit some debugging if for some reason we get a runtime call to remap
an already mapped chip to a new node?
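Roughly what I have in mind, as an untested userspace sketch. The name map_chipid_to_nid_checked(), the remap_attempts counter, and the choice to reject a conflicting mapping (rather than keep the lowest id, as the patch does) are all assumptions for illustration; in the kernel the fprintf() would presumably be a pr_debug() or a WARN.

```c
#include <stdio.h>

#define MAX_NUMNODES 4		/* assumption: tiny value to keep the sketch short */
#define NUMA_NO_NODE (-1)

static int chipid_to_nid_map[MAX_NUMNODES] = {
	NUMA_NO_NODE, NUMA_NO_NODE, NUMA_NO_NODE, NUMA_NO_NODE
};
static int nid_to_chipid_map[MAX_NUMNODES] = {
	NUMA_NO_NODE, NUMA_NO_NODE, NUMA_NO_NODE, NUMA_NO_NODE
};

static int remap_attempts;	/* conflicting mapping requests seen so far */

/*
 * Record chipid <-> nid, refusing to overwrite an existing, different
 * mapping in either direction. Both arguments must already be in range.
 * Returns 0 on success, -1 on a conflict.
 */
static int map_chipid_to_nid_checked(int chipid, int nid)
{
	int old_nid = chipid_to_nid_map[chipid];
	int old_chipid = nid_to_chipid_map[nid];

	if ((old_nid != NUMA_NO_NODE && old_nid != nid) ||
	    (old_chipid != NUMA_NO_NODE && old_chipid != chipid)) {
		remap_attempts++;
		fprintf(stderr, "chip %d -> node %d conflicts with an existing mapping\n",
			chipid, nid);
		return -1;
	}

	chipid_to_nid_map[chipid] = nid;
	nid_to_chipid_map[nid] = chipid;
	return 0;
}
```

Repeating an identical mapping is treated as harmless here; only a request that would change an established pair is reported.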
+
+int map_chipid_to_nid(int chipid)
+{
+ int nid;
+
+ if (chipid < 0 || chipid >= MAX_NUMNODES)
+ return NUMA_NO_NODE;
+
+ nid = chipid_to_nid_map[chipid];
+ if (nid == NUMA_NO_NODE) {
+ if (nodes_weight(chipid_map) >= MAX_NUMNODES)
+ return NUMA_NO_NODE;
If you create a KVM guest with a bogus topology, doesn't this just start
losing NUMA information for guests with very many nodes?
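A small userspace model of this function, if I am reading the patch right, shows where a topology can fall off the end. The assumptions are loud ones: MAX_NUMNODES is 4, a plain unsigned int replaces nodemask_t, first_unset_nid() is a hypothetical stand-in for first_unset_node(), the nodes_weight() check is folded into the "no free bit" case, and the reverse map is left out.

```c
#define MAX_NUMNODES 4		/* assumption: tiny value for illustration */
#define NUMA_NO_NODE (-1)

/* Bit n set means nid n is in use (stand-in for the chipid_map nodemask). */
static unsigned int chipid_map;

static int chipid_to_nid_map[MAX_NUMNODES] = {
	NUMA_NO_NODE, NUMA_NO_NODE, NUMA_NO_NODE, NUMA_NO_NODE
};

/* Lowest nid whose bit is clear, or MAX_NUMNODES if every nid is taken. */
static int first_unset_nid(void)
{
	int nid;

	for (nid = 0; nid < MAX_NUMNODES; nid++)
		if (!(chipid_map & (1u << nid)))
			return nid;
	return MAX_NUMNODES;
}

int map_chipid_to_nid(int chipid)
{
	int nid;

	/* A sparse chipid at or above MAX_NUMNODES is dropped right here. */
	if (chipid < 0 || chipid >= MAX_NUMNODES)
		return NUMA_NO_NODE;

	nid = chipid_to_nid_map[chipid];
	if (nid == NUMA_NO_NODE) {
		nid = first_unset_nid();
		if (nid >= MAX_NUMNODES)
			return NUMA_NO_NODE;	/* no free nid left */
		chipid_to_nid_map[chipid] = nid;
		chipid_map |= 1u << nid;
	}
	return nid;
}
```

In this model chipids 3 and 1 get nids 0 and 1, but a chipid of 1000 comes back as NUMA_NO_NODE even though nids 2 and 3 are still free.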
+ nid = first_unset_node(chipid_map);
+ __map_chipid_to_nid(chipid, nid);
+ node_set(nid, chipid_map);
+ }
+ return nid;
+}
+
int numa_cpu_lookup(int cpu)
{
return numa_cpu_lookup_table[cpu];
@@ -264,7 +311,6 @@ out:
return chipid;
}
-
stray change?