Re: PROBLEM: Remapping hugepages mappings causes kernel to return EINVAL
From: C.Wehrmeyer
Date: Mon Oct 23 2017 - 08:23:21 EST
On 2017-10-23 13:42, Michal Hocko wrote:
> I do not remember any such a request either. I can see some merit in the
> described use case. It is not specific on why hugetlb pages are used for
> the allocator memory because that comes with it own issues.
That is for the user to specify. Hugepages still require a special
setup that not everyone has: to my knowledge, a kernel compiled with
CONFIG_HUGETLBFS=y and a number of such pages allocated either through
the kernel boot line or through
/sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages. I'm
deliberately ignoring 1-GiB pages here because those are only
allocatable during boot, when no processes have been spawned and memory
is not yet fragmented.
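To check whether that setup has been done on a given machine, a program
can read the pool size from the sysfs file mentioned above. The
following is only a minimal sketch: the helper name read_hugepage_count
is hypothetical, and it assumes the file contains a single decimal
number.

```c
#include <stdio.h>

/* Hypothetical helper: returns the number read from a sysfs-style file,
 * or -1 if the file cannot be opened or parsed (for example, because
 * the kernel was built without hugetlb support). */
long read_hugepage_count(const char *path)
{
    FILE *f = fopen(path, "r");
    long n = -1;

    if (!f)
        return -1;
    if (fscanf(f, "%ld", &n) != 1)
        n = -1;
    fclose(f);
    return n;
}
```

Calling it with
"/sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages" yields the
configured number of 2-MiB pages, or -1 when the file is missing.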
My point is that I can see people not being too eager to support 1-GiB
pages for now, except for very specific use cases. 2-MiB pages, on the
other hand, shouldn't have those limitations anymore. User-space
programs should be capable of allocating such pages without the need for
the user to fiddle with nr_hugepages beforehand.
Some time ago I wrote some code to detect the TLB capabilities of my
current testing CPU; these are the results:
[TLB] Instruction TLB: 2M/4M pages, fully associative, 8 entries
[TLB] Data TLB: 4 KByte pages, 4-way set associative, 64 entries
[TLB] Data TLB: 2 MByte or 4 MByte pages, 4-way set associative, 32 entries and a separate array with 1 GByte pages, 4-way set associative, 4 entries
[TLB] Instruction TLB: 4KByte pages, 8-way set associative, 64 entries
[STLB] Shared 2nd-Level TLB: 4 KByte/2MByte pages, 8-way associative, 1024 entries
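To put those entry counts into perspective, the memory a TLB level can
cover without a miss (its "reach") is simply entries times page size.
The sketch below only does that arithmetic; the function names are
hypothetical, and the entry counts are the ones from the listing above.

```c
#include <stdint.h>

/* Bytes of address space covered by a TLB with the given number of
 * entries when every entry maps one page of page_size bytes. */
uint64_t tlb_reach_bytes(uint64_t entries, uint64_t page_size)
{
    return entries * page_size;
}

/* Number of small pages needed to cover one huge page. */
uint64_t small_pages_per_huge_page(uint64_t huge_size, uint64_t small_size)
{
    return huge_size / small_size;
}
```

For the first-level data TLB this gives 64 * 4 KiB = 256 KiB with 4-KiB
pages versus 32 * 2 MiB = 64 MiB with 2-MiB pages.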
Given that allocations in the mebibyte range aren't uncommon at all
nowadays and that one 2-MiB page eliminates the need for 512 4-KiB
pages, we really should make advances towards treating 2-MiB pages just
as casually as 4-KiB pages. Allocators can still query whether the
kernel supports the specified page size, and specifying MAP_HUGETLB |
MAP_HUGE_2MB would still be required in order to not break older
programs, but from my perspective there is a lot to gain here.
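
As an illustration of what such an allocator path could look like, here
is a rough sketch that asks for 2-MiB pages via MAP_HUGETLB |
MAP_HUGE_2MB and falls back to regular pages if the kernel refuses. The
function name alloc_region and its out-parameters are hypothetical, and
the fallback definitions of MAP_HUGE_SHIFT and MAP_HUGE_2MB are an
assumption for older libc headers that do not provide them.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Assumed fallbacks for headers that lack these macros. */
#ifndef MAP_HUGE_SHIFT
#define MAP_HUGE_SHIFT 26
#endif
#ifndef MAP_HUGE_2MB
#define MAP_HUGE_2MB (21 << MAP_HUGE_SHIFT)   /* log2(2 MiB) = 21 */
#endif

#define HUGE_2MB ((size_t)2 << 20)

/* Hypothetical helper: maps at least len bytes of anonymous memory,
 * rounded up to a multiple of 2 MiB. Tries 2-MiB hugetlb pages first
 * and falls back to regular pages. Returns NULL on failure; on success
 * stores the mapped length and whether hugetlb pages were used. */
void *alloc_region(size_t len, size_t *mapped_len, int *used_huge)
{
    size_t rounded = (len + HUGE_2MB - 1) & ~(HUGE_2MB - 1);
    int prot = PROT_READ | PROT_WRITE;
    int flags = MAP_PRIVATE | MAP_ANONYMOUS;
    void *p;

    p = mmap(NULL, rounded, prot, flags | MAP_HUGETLB | MAP_HUGE_2MB, -1, 0);
    *used_huge = (p != MAP_FAILED);
    if (p == MAP_FAILED)
        p = mmap(NULL, rounded, prot, flags, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    *mapped_len = rounded;
    return p;
}
```

The caller releases the region with munmap(p, mapped_len); the length
is kept rounded because hugetlb mappings have to be unmapped in
multiples of the huge page size.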