On Wed, Oct 14, 2020 at 01:32:55AM -0700, Ankur Arora wrote:
This can potentially improve page-clearing bandwidth (see below for
performance numbers for two microarchitectures where it helps and one
where it doesn't) and can help indirectly by consuming less cache
resources.

Any performance benefits are expected for extents larger than LLC-sized
or more -- when we are DRAM-BW constrained rather than cache-BW
constrained.

"potentially", "expected", I don't like those formulations.

That's fair. The reason for those weasel words is mostly because it
Do you have
some actual benchmark data where this shows any improvement and not
microbenchmarks only, to warrant the additional complexity?

Yes, guest creation under QEMU (pinned guests) shows similar improvements.