Nick Piggin wrote:
Of course if it was free performance then we'd want it. The downsides are that it
adds significant complexity for a pretty small (3%) performance gain on your apparent
target workload, which is fairly uncommon among Linux users.
Our performance data demonstrated that the potential gain for the non-hugepage case is much higher than 3%.
Ignoring the complexity, it is still not free. Sharing data across processes adds to
synchronisation overhead and hurts scalability. Some of these page fault scalability
scenarios have proven important enough that we have introduced complexity _there_.
True, but this needs to be balanced against the fact that pagetable sharing will reduce the number of page faults when it is achieved. Let's say you have N processes which touch all the pages in an M-page shared memory region. Without shared pagetables this requires N*M page faults; if pagetable sharing is achieved, only M page faults are required.
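To make that arithmetic concrete, here is a minimal sketch (mine, not something from the patch or this thread; N, M and the MAP_SHARED|MAP_ANONYMOUS region are just illustrative assumptions) of the kind of workload being described: N forked processes each touch every page of an M-page shared region and report their own minor-fault count. Without pagetable sharing each process faults once per page, roughly N*M faults in total; if sharing kicks in for this kind of mapping, later processes reuse the already-populated entries and the total drops toward M.

/* Minimal sketch, not from the patch itself: N processes each touch every
 * page of an M-page shared mapping, then report their minor-fault count. */
#define _GNU_SOURCE
#include <stdio.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <sys/wait.h>
#include <unistd.h>

#define N 4			/* number of processes               */
#define M 1024			/* pages in the shared memory region */

int main(void)
{
	long pagesize = sysconf(_SC_PAGESIZE);
	char *region = mmap(NULL, (size_t)M * pagesize,
			    PROT_READ | PROT_WRITE,
			    MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	if (region == MAP_FAILED) {
		perror("mmap");
		return 1;
	}

	for (int i = 0; i < N; i++) {
		if (fork() == 0) {
			struct rusage ru;

			/* touch every page of the shared region */
			for (long p = 0; p < M; p++)
				region[p * pagesize] = 1;

			getrusage(RUSAGE_SELF, &ru);
			printf("process %d: %ld minor faults\n", i, ru.ru_minflt);
			_exit(0);
		}
	}
	for (int i = 0; i < N; i++)
		wait(NULL);
	munmap(region, (size_t)M * pagesize);
	return 0;
}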
And it seems customers running "out-of-the-box" settings would really want to start using
hugepages if they're interested in getting the most performance possible, no?
My perspective is that, once the customer is required to invoke "echo XXX > /proc/sys/vm/nr_hugepages", they've left the "out-of-the-box" domain and entered the domain of hoping that the number of hugepages they reserved is sufficient, because if it's not, they'll probably need to reboot (growing the pool later tends to fail once memory has fragmented), which can be pretty inconvenient for a production transaction-processing application.
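For illustration only (my sketch, not anything from this thread, and it assumes a kernel new enough to support MAP_HUGETLB; otherwise the equivalent would be mapping a file from a mounted hugetlbfs), this is what "hoping the number of hugepages is sufficient" looks like from the application's side: if the pool reserved via /proc/sys/vm/nr_hugepages doesn't cover the request, the reservation fails up front, and the remedies are falling back to small pages or growing the pool, which may no longer be possible without a reboot.

/* Minimal sketch, assuming MAP_HUGETLB support: ask for 256 MB backed by
 * hugepages.  If the reserved pool is too small, the reservation fails at
 * mmap() time, typically with ENOMEM. */
#define _GNU_SOURCE
#include <stdio.h>
#include <sys/mman.h>

#define LEN (256UL << 20)	/* 256 MB */

int main(void)
{
	void *p = mmap(NULL, LEN, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
	if (p == MAP_FAILED) {
		perror("mmap(MAP_HUGETLB)");	/* hugepage pool too small? */
		return 1;
	}
	printf("got %lu MB of hugepage-backed memory\n", LEN >> 20);
	munmap(p, LEN);
	return 0;
}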