Re: [LSF/MM TOPIC] NUMA, memory hierarchy and device memory

From: Jerome Glisse
Date: Thu Apr 25 2019 - 16:16:30 EST



I see that the schedule for the mm track is not full yet, and I would
really like to be able to have a discussion on this topic.

Schedule:
https://docs.google.com/spreadsheets/d/1Z1pDL-XeUT1ZwMWrBL8T8q3vtSqZpLPgF3Bzu_jejfk/edit#gid=0


On Fri, Jan 18, 2019 at 12:45:13PM -0500, Jerome Glisse wrote:
> Hi, I would like to discuss the NUMA API and its shortcomings when
> it comes to memory hierarchy (from fast HBM, through regular memory,
> to slower persistent memory) and also device memory (which can have
> its own hierarchy).
>
> I have proposed a patch to add a new memory topology model to the
> kernel so that applications can get that information; it also
> includes a set of new APIs to bind/migrate a range of a process's
> address space [1]. Note that this model also supports device memory.
>
> So far device memory support is achieved through device-specific
> ioctls, and this forbids some scenarios, such as interleaving device
> memory across multiple devices for a range. It also makes userspace
> as a whole more complex, as programs have to mix and match multiple
> device-specific APIs on top of the NUMA API.
>
> While memory hierarchy can be more or less exposed through the existing
> NUMA API by creating nodes for non-regular memory [2], I do not see this
> as a satisfying solution. Moreover, such a scheme does not work for
> device memory that might not even be accessible by CPUs.
>
>
> Hence I would like to discuss a few points:
> - What proof do people want to see that this is a problem we need
>   to solve?
> - How do we build consensus to move forward on this?
> - What kind of syscall API would people like to see?
>
> People who may want to discuss this topic:
> Dan Williams <dan.j.williams@xxxxxxxxx>
> Dave Hansen <dave.hansen@xxxxxxxxx>
> Felix Kuehling <Felix.Kuehling@xxxxxxx>
> John Hubbard <jhubbard@xxxxxxxxxx>
> Jonathan Cameron <jonathan.cameron@xxxxxxxxxx>
> Keith Busch <keith.busch@xxxxxxxxx>
> Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
> Michal Hocko <mhocko@xxxxxxxxxx>
> Paul Blinzer <Paul.Blinzer@xxxxxxx>
>
> Probably others; sorry if I missed anyone from previous discussions.
>
> Cheers,
> Jérôme
>
> [1] https://lkml.org/lkml/2018/12/3/1072
> [2] https://lkml.org/lkml/2018/12/10/1112
>