[PATCH v5 0/4] KVM: mm: count KVM mmu usage in memory stats

From: Yosry Ahmed
Date: Mon Jun 06 2022 - 18:21:13 EST


We keep track of several kernel memory stats (total kernel memory, page
tables, stack, vmalloc, etc) on multiple levels (global, per-node,
per-memcg, etc). These stats give users insight into how much memory
the kernel uses and for what purposes.

Currently, memory used by the KVM MMU is not accounted in any of those
kernel memory stats. This patch series accounts the memory pages
used by KVM for page tables in those stats in a new
NR_SECONDARY_PAGETABLE stat. This stat can later be extended to account
for other types of secondary page tables (e.g. iommu page tables).

KVM has a decent number of large allocations that aren't for page
tables, but for most of them, the number/size of those allocations
scales linearly with either the number of vCPUs or the amount of memory
assigned to the VM. KVM's secondary page table allocations do not scale
linearly, especially when nested virtualization is in use.

From a KVM perspective, NR_SECONDARY_PAGETABLE will scale with KVM's
per-VM pages_{4k,2m,1g} stats unless the guest is doing something
bizarre (e.g. accessing only 4kb chunks of 2mb pages so that KVM is
forced to allocate a large number of page tables even though the guest
isn't accessing that much memory). However, someone would need to either
understand how KVM works to make that connection, or know (or be told) to
go look at KVM's stats if they're running VMs to better decipher the stats.

Also, having NR_PAGETABLE side-by-side with NR_SECONDARY_PAGETABLE is
informative. For example, when backing a VM with THP vs. HugeTLB,
NR_SECONDARY_PAGETABLE is roughly the same, but NR_PAGETABLE is an order
of magnitude higher with THP. So having this stat will at the very least
prove to be useful for understanding tradeoffs between VM backing types,
and likely even steer folks towards potential optimizations.

---

Changes in V5:
- Updated cover letter to better explain the rationale behind the change
(Thanks to contributions by Sean Christopherson).
- Removed extraneous + in arm64 patch (Oliver Upton, Marc Zyngier).
- Shortened secondary_pagetables to sec_pagetables (Shakeel Butt).
- Removed dependency on other patchsets (applies to queue branch).

Changes in V4:
- Changed accounting hooks in arm64 to only account s2 page tables and
refactored them to a much cleaner form, based on recommendations from
Oliver Upton and Marc Zyngier.
- Dropped patches for mips and riscv. I am not interested in those archs
anyway and don't have the resources to test them. I posted them for
completeness but it doesn't seem like anyone was interested.

Changes in V3:
- Added NR_SECONDARY_PAGETABLE instead of piggybacking on NR_PAGETABLE
stats.

Changes in V2:
- Added accounting stats for archs other than x86.
- Changed locations in the code where x86 KVM page table stats were
accounted based on suggestions from Sean Christopherson.

---

Yosry Ahmed (4):
mm: add NR_SECONDARY_PAGETABLE to count secondary page table uses.
KVM: mmu: add a helper to account memory used by KVM MMU.
KVM: x86/mmu: count KVM mmu usage in secondary pagetable stats.
KVM: arm64/mmu: count KVM s2 mmu usage in secondary pagetable stats

Documentation/admin-guide/cgroup-v2.rst | 5 ++++
Documentation/filesystems/proc.rst | 4 +++
arch/arm64/kvm/mmu.c | 35 ++++++++++++++++++++++---
arch/x86/kvm/mmu/mmu.c | 16 +++++++++--
arch/x86/kvm/mmu/tdp_mmu.c | 12 +++++++++
drivers/base/node.c | 2 ++
fs/proc/meminfo.c | 2 ++
include/linux/kvm_host.h | 9 +++++++
include/linux/mmzone.h | 1 +
mm/memcontrol.c | 1 +
mm/page_alloc.c | 6 ++++-
mm/vmstat.c | 1 +
12 files changed, 87 insertions(+), 7 deletions(-)

--
2.36.1.255.ge46751e96f-goog