Re: [RFC PATCH 00/14] Heterogeneous Memory System (HMS) and hbind()

From: Dave Hansen
Date: Tue Dec 04 2018 - 18:54:26 EST


On 12/3/18 3:34 PM, jglisse@xxxxxxxxxx wrote:
> This patchset uses the above scheme to expose system topology through
> sysfs under /sys/bus/hms/ with:
> - /sys/bus/hms/devices/v%version-%id-target/ : a target memory,
> each has a UID and you can find the usual values in that folder (node id,
> size, ...)
>
> - /sys/bus/hms/devices/v%version-%id-initiator/ : an initiator
> (CPU or device), each has a HMS UID but also a CPU id for CPU
> (which matches the CPU id in /sys/bus/cpu/). For a device you have a
> path that can be a PCIe bus ID, for instance.
>
> - /sys/bus/hms/devices/v%version-%id-link : a link, each has a
> UID and a file per property (bandwidth, latency, ...) you also
> find a symlink to every target and initiator connected to that
> link.
>
> - /sys/bus/hms/devices/v%version-%id-bridge : a bridge, each has
> a UID and a file per property (bandwidth, latency, ...) you
> also find a symlink to all initiators that can use that bridge.

We support 1024 NUMA nodes on x86. The ACPI HMAT expresses the
connections between each pair of nodes. Let's suppose that each node
has some CPUs and some memory.

That means we'll have 1024 target directories in sysfs, 1024 initiator
directories in sysfs, and 1024*1024 link directories. Or, would the
kernel be responsible for "compiling" the firmware-provided information
down into a more manageable number of links?
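
To put a number on that, here is a rough back-of-the-envelope sketch. It
assumes one link directory per ordered (initiator, target) pair and no
bridge directories at all; that mapping is my assumption, not something
the patchset states, and hms_dir_count() is a made-up helper name.

```python
def hms_dir_count(nodes: int) -> dict:
    """Count sysfs directories if every node has both CPUs and memory.

    Assumption (not from the patchset): one link directory per ordered
    (initiator, target) pair, and no bridge directories.
    """
    targets = nodes          # one target directory per node's memory
    initiators = nodes       # one initiator directory per node's CPUs
    links = nodes * nodes    # full mesh: every initiator to every target
    return {
        "targets": targets,
        "initiators": initiators,
        "links": links,
        "total": targets + initiators + links,
    }


counts = hms_dir_count(1024)
print(counts)
```

With 1024 nodes the link term dominates: it alone is over a million
directories, which is why any "compiling" step would have to collapse
links rather than targets or initiators.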

Some idiot made the mistake of having one sysfs directory per 128MB of
memory way back when, and now we have hundreds of thousands of
/sys/devices/system/memory/memoryX directories. That sucks to manage.
Isn't this potentially repeating that mistake?
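
The same arithmetic for the memoryX case, as a sketch that assumes every
block is exactly 128MB (memory_block_dirs() is a hypothetical helper, and
the RAM sizes are just illustrative):

```python
MIB = 1024 ** 2
TIB = 1024 ** 4


def memory_block_dirs(ram_bytes: int, block_bytes: int = 128 * MIB) -> int:
    """Number of memoryX directories, assuming a fixed block size.

    Assumption: one directory per block and every block is block_bytes
    long; partial trailing blocks are rounded up.
    """
    return -(-ram_bytes // block_bytes)  # ceiling division


for ram_tib in (1, 16, 64):
    print(f"{ram_tib} TiB -> {memory_block_dirs(ram_tib * TIB)} directories")
```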

Basically, is sysfs the right place to even expose this much data?