[PATCH] x86,numa: Pick up reserved memblock from SRAT entries

From: Fan Du
Date: Thu Jun 17 2021 - 21:37:41 EST


The following error message shows up on a two-socket server with SGX enabled:
[ 2.264955] sgx: EPC section 0x2000c00000-0x207f7fffff
[ 2.269093] sgx: EPC section 0x4000c00000-0x407fffffff
[ 2.273242] sgx: [Firmware Bug]: Unable to map EPC section to online node. Fallback to the NUMA node 0.

SGX EPC ranges are reserved (E820) memory managed directly by the SGX
driver. The second EPC section is expected to be bound to NUMA node 1,
but phys_to_target_node() fails to find a valid online NUMA node for
this address range.

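This is not actually a firmware bug. The root cause is that the second
EPC section lies at the very end of the SRAT range for node 1, as shown
below. numa_cleanup_meminfo() trims that block to the upper memory limit
and drops the tail, so the EPC range is never picked up by
numa_reserved_meminfo. Add a check that saves the trimmed tail as a
reserved block.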

[ 0.022842] ACPI: SRAT: Node 0 PXM 0 [mem 0x00000000-0x7fffffff]
[ 0.022844] ACPI: SRAT: Node 0 PXM 0 [mem 0x100000000-0x207fffffff]
[ 0.022846] ACPI: SRAT: Node 1 PXM 1 [mem 0x2080000000-0x407fffffff]
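
The trimming step can be illustrated with a small user-space sketch. It
is not kernel code: struct blk and trim_and_save_tail() are hypothetical
names made up for this example, and the upper limit is assumed to be
0x4000000000, which is what the numa_meminfo dumps below suggest.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a NUMA memory block; 'end' is exclusive. */
struct blk {
	uint64_t start;
	uint64_t end;
	int nid;
};

/*
 * Trim *b so that it ends at 'high'. If part of the block lies above
 * 'high', store that part in *tail (same node id) and return 1.
 * Return 0 if nothing had to be cut off.
 * Precondition for this sketch: the block starts below 'high'.
 */
static int trim_and_save_tail(struct blk *b, uint64_t high, struct blk *tail)
{
	assert(b->start < b->end);
	assert(b->start < high);

	if (b->end <= high)
		return 0;

	tail->start = high;
	tail->end = b->end;
	tail->nid = b->nid;
	b->end = high;
	return 1;
}
```

For the node 1 SRAT entry above (0x2080000000-0x4080000000 with an
exclusive end) and the assumed limit, the saved tail is
0x4000000000-0x4080000000 on node 1, which covers the second EPC section.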

w/o this patch:
crash> numa_meminfo
numa_meminfo = $1 = {
nr_blks = 0x2,
blk = {{
start = 0x0,
end = 0x2080000000,
nid = 0x0
}, {
start = 0x2080000000,
end = 0x4000000000,
nid = 0x1
},
crash> numa_reserved_meminfo
numa_reserved_meminfo = $2 = {
nr_blks = 0x0,
blk = {{
start = 0x0,
end = 0x0,
nid = 0x0
},

w/ this patch:
crash> numa_meminfo
numa_meminfo = $1 = {
nr_blks = 0x2,
blk = {{
start = 0x0,
end = 0x2080000000,
nid = 0x0
}, {
start = 0x2080000000,
end = 0x4000000000,
nid = 0x1
},
crash> numa_reserved_meminfo
numa_reserved_meminfo = $2 = {
nr_blks = 0x1,
blk = {{
start = 0x4000000000,
end = 0x4080000000,
nid = 0x1
},
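
Why the extra reserved block fixes the lookup can be sketched in user
space as follows. This assumes that phys_to_target_node() searches
numa_meminfo first and falls back to numa_reserved_meminfo; struct blk,
table_to_nid(), lookup_node() and NO_NODE are hypothetical names used
only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NO_NODE (-1)	/* hypothetical "no node found" value */

/* Hypothetical stand-in for a NUMA memory block; 'end' is exclusive. */
struct blk {
	uint64_t start;
	uint64_t end;
	int nid;
};

/* Return the node id of the first block containing addr, or NO_NODE. */
static int table_to_nid(const struct blk *t, size_t n, uint64_t addr)
{
	assert(t != NULL || n == 0);

	for (size_t i = 0; i < n; i++)
		if (addr >= t[i].start && addr < t[i].end)
			return t[i].nid;
	return NO_NODE;
}

/* Search the regular memory table first, then the reserved table. */
static int lookup_node(const struct blk *mem, size_t nmem,
		       const struct blk *rsv, size_t nrsv, uint64_t addr)
{
	int nid = table_to_nid(mem, nmem, addr);

	if (nid != NO_NODE)
		return nid;
	return table_to_nid(rsv, nrsv, addr);
}
```

Fed with the two dumps above, the start of the second EPC section
(0x4000c00000) matches no block while the reserved table is empty, and
resolves to node 1 once the 0x4000000000-0x4080000000 entry is present.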

Signed-off-by: Fan Du <fan.du@xxxxxxxxx>
Reviewed-by: Jarkko Sakkinen <jarkko@xxxxxxxxxx>
Reviewed-by: Dan Williams <dan.j.williams@xxxxxxxxx>
Fixes: 5d30f92e7631 ("x86/NUMA: Provide a range-to-target_node lookup facility")
---
arch/x86/mm/numa.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
index 5eb4dc2b97da..e23af389cad9 100644
--- a/arch/x86/mm/numa.c
+++ b/arch/x86/mm/numa.c
@@ -254,7 +254,12 @@ int __init numa_cleanup_meminfo(struct numa_meminfo *mi)

 		/* make sure all non-reserved blocks are inside the limits */
 		bi->start = max(bi->start, low);
-		bi->end = min(bi->end, high);
+
+		/* handle reserved memory at the end of the SRAT range */
+		if (bi->end > high) {
+			numa_add_memblk_to(bi->nid, high, bi->end, &numa_reserved_meminfo);
+			bi->end = high;
+		}
 
 		/* and there's no empty block */
 		if (bi->start >= bi->end)
--
2.27.0