On Mon, 23 Jul 2018, David Rientjes wrote:

> The huge zero page can be reclaimed under memory pressure and, if it
> is, an attempt is made to allocate it again with gfp flags that allow
> memory compaction, which can become expensive. If we are constantly
> under memory pressure, it gets freed and reallocated millions of
> times, always trying to compact memory both directly and by kicking
> kcompactd in the background.
>
> It likely should also be per node.

Have you benchmarked making the non-huge zero page per-node?

Not since we disable it :) I will, though. The more concerning issue
for us, modulo CVE-2017-1000405, is the cpu cost of constantly directly
compacting memory for allocating the hzp in real time after it has been
reclaimed. We've observed this happening tens or hundreds of thousands
of times on some systems. It will be 2MB per node on x86 if the data
suggests we should make it NUMA aware; I don't think the cost is too
high to leave it persistently available, even under memory pressure,
when use_zero_page is enabled.

Measuring access latency to 4GB of memory on Naples, I observe ~6.7%
slower access latency intrasocket and ~14% slower intersocket.
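For reference, the cycle in question is roughly the below; this is a
sketch paraphrased from mm/huge_memory.c rather than verbatim kernel
code, so names and details may differ slightly by version:

	/*
	 * Sketch of the current behaviour, paraphrased from memory --
	 * not verbatim mm/huge_memory.c.
	 */
	static struct page *get_huge_zero_page(void)
	{
		struct page *zero_page;
	retry:
		/* Fast path: the hzp still exists, just take a reference. */
		if (likely(atomic_inc_not_zero(&huge_zero_refcount)))
			return READ_ONCE(huge_zero_page);

		/*
		 * Slow path: the shrinker freed it, so a fresh zeroed 2MB
		 * page must be allocated. GFP_TRANSHUGE allows direct
		 * reclaim and compaction, which is where the cpu cost
		 * comes from.
		 */
		zero_page = alloc_pages((GFP_TRANSHUGE | __GFP_ZERO) & ~__GFP_MOVABLE,
					HPAGE_PMD_ORDER);
		if (!zero_page)
			return NULL;

		/* Lost a race with a concurrent allocation: use the winner's page. */
		if (cmpxchg(&huge_zero_page, NULL, zero_page)) {
			__free_pages(zero_page, HPAGE_PMD_ORDER);
			goto retry;
		}

		/* One reference for the caller plus one owned by the shrinker. */
		atomic_set(&huge_zero_refcount, 2);
		return READ_ONCE(huge_zero_page);
	}

	static unsigned long shrink_huge_zero_page_scan(struct shrinker *shrink,
							struct shrink_control *sc)
	{
		/* Only the shrinker's reference is left: free under pressure. */
		if (atomic_cmpxchg(&huge_zero_refcount, 1, 0) == 1) {
			struct page *zero_page = xchg(&huge_zero_page, NULL);

			__free_pages(zero_page, HPAGE_PMD_ORDER);
			return HPAGE_PMD_NR;
		}
		return 0;
	}

Under sustained pressure the shrinker keeps freeing it, which is the
free/reallocate churn described above.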
use_zero_page is currently a simple thp flag, meaning it rejects writes
where val != !!val, so perhaps it would be best to overload it with
additional options? I can imagine 0x2 defining persistent allocation,
so that the hzp is not freed when its refcount goes to 0, and 0x4
defining whether the hzp should be per node. Implementing persistent
allocation fixes our concern with it, so I'd like to start there.
Comments?
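To make that concrete, a minimal sketch of the persistence half,
assuming a new bit in enum transparent_hugepage_flag for the 0x2 value
(the flag name below is just a placeholder, and the sysfs parsing
changes are omitted): if the bit is set, the shrinker simply never
reports the hzp as reclaimable, so it stays allocated across memory
pressure.

	/*
	 * Hypothetical new bit in enum transparent_hugepage_flag
	 * representing "use_zero_page & 0x2"; the name is a placeholder.
	 */
	static inline bool huge_zero_page_persistent(void)
	{
		return test_bit(TRANSPARENT_HUGEPAGE_ZERO_PAGE_PERSISTENT_FLAG,
				&transparent_hugepage_flags);
	}

	static unsigned long shrink_huge_zero_page_count(struct shrinker *shrink,
							 struct shrink_control *sc)
	{
		/* Persistent hzp: never offer it to the shrinker. */
		if (huge_zero_page_persistent())
			return 0;

		/* Current behaviour: reclaimable iff only the extra ref remains. */
		return atomic_read(&huge_zero_refcount) == 1 ? HPAGE_PMD_NR : 0;
	}

Gating the count callback should be enough, since the scan callback is
only invoked for objects the shrinker reports as reclaimable.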