Hugh Dickins wrote:
Let me say (while perhaps others are still reading) that I'm seriously
wondering whether you should actually restrict your shared pagetable work
to the hugetlb case. I realize that would be a disappointing limitation
to you, and would remove the 25%/50% improvement cases, leaving only the
3%/4% last-ounce-of-performance cases.
But it's worrying me a lot that these complications to core mm code will
_almost_ never apply to the majority of users, will get little testing
outside of specialist setups. I'd feel safer to remove that "almost",
and consign shared pagetables to the hugetlb ghetto, if that would
indeed remove their handling from the common code paths. (Whereas,
if we didn't have hugetlb, I would be arguing strongly for shared pts.)
Hi,
In the case of x86-64, if pagetable sharing for small pages were eliminated, we'd lose more than the 27-33% throughput improvement observed when the bufferpools are in small pages. We'd also lose a significant chunk of the 3% improvement observed when the bufferpools are in hugepages. This is because some small-page pagetable sharing is still achieved, minimally for database text, even when the bufferpools are in hugepages. The performance counters indicated that ITLB and DTLB page walks were reduced by 28% and 10%, respectively, in the x86-64/hugepage case.
To be clear, all measurements discussed in my post were performed with kernels config'ed to share pagetables for both small pages and hugepages.
If we had to choose between pagetable sharing for small pages and hugepages, we would be in favor of retaining pagetable sharing for small pages. That is where the discernible benefit is for customers that run with "out-of-the-box" settings. Also, there is still some benefit there on x86-64 for customers that use hugepages for the bufferpools.