On 9/14/2018 1:56 AM, Michal Hocko wrote:
> On Thu 13-09-18 15:32:25, prakash.sangappa wrote:
>> The proc interface provides an efficient way to export address range
>> to numa node id mapping information compared to using the API.
>
> Do you have any numbers?
>
>> For example, for sparsely populated mappings, if a VMA has large portions
>> not have any physical pages mapped, the page walk done thru the /proc file
>> interface can skip over non existent PMDs / ptes. Whereas using the
>> API the application would have to scan the entire VMA in page size units.
>
> What prevents you from pre-filtering by reading /proc/$pid/maps to get
> ranges of interest?

That works for skipping holes, but not for skipping huge pages. I did a
quick experiment to time move_pages on a 3 GHz Xeon and a 4.18 kernel.
Allocate 128 GB and touch every small page. Call move_pages with nodes=NULL
to get the node id for all pages, passing 512 consecutive small pages per
call. The total move_pages time is 1.85 secs, or 55 nsec per page.
Extrapolating to a 1 TB range, it would take 15 sec to retrieve
the numa node for every small page in the range. That is not terrible, but
it is not interactive, and it becomes terrible for multiple TB.
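
For reference, a minimal sketch of the kind of query loop described in the
experiment. This is not the test program that produced the numbers; the
names query_nodes and BATCH are made up for illustration, and it uses the
raw syscall so it does not depend on libnuma.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <unistd.h>

#define BATCH 512  /* pages per move_pages call, as in the experiment */

/*
 * Fill status[i] with the NUMA node of page i in the range
 * [base, base + npages * page_size).
 *
 * With nodes == NULL, move_pages moves nothing and only reports where
 * each page currently resides. A negative status[i] is a per-page
 * -errno (for example -ENOENT when no page is present).
 *
 * Returns 0 on success, -1 if a move_pages call fails (errno is set).
 */
static int query_nodes(char *base, size_t npages, size_t page_size,
                       int *status)
{
    void *pages[BATCH];

    for (size_t done = 0; done < npages; done += BATCH) {
        size_t n = npages - done < BATCH ? npages - done : BATCH;

        /* One address per small page in this batch. */
        for (size_t i = 0; i < n; i++)
            pages[i] = base + (done + i) * page_size;

        /* pid 0 means the calling process; flags are unused here. */
        if (syscall(SYS_move_pages, 0L, (unsigned long)n, pages,
                    (const int *)NULL, status + done, 0L) < 0)
            return -1;
    }
    return 0;
}
```

Every small page needs its own pages[] entry and its own status[] slot,
even when the range is backed by huge pages, which is why the cost scales
with the number of small pages rather than with the number of mappings.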