[PATCH v7 0/6] arm64: ptdump: View the second stage page-tables

From: Sebastian Ene
Date: Fri Jun 21 2024 - 08:33:01 EST


Hi,


This series extends the ptdump support to allow dumping the guest
stage-2 pagetables. When CONFIG_PTDUMP_STAGE2_DEBUGFS is enabled, ptdump
registers a '/sys/kernel/debug/kvm/<guest_id>/stage2_page_tables' entry
with debugfs upon guest creation. This allows userspace tools (e.g. cat)
to dump the stage-2 pagetables by reading the registered file.

Reading the debugfs file shows stage-2 memory ranges in the following format:
<IPA range> <size> <descriptor type> <access permissions> <mem_attributes>

Below is the stage-2 pagetable dump of a guest running under QEMU:

---[ IPA bits 33 start lvl 2 ]---
0x0000000000000000-0x0000000080000000 2G PGD
0x0000000080000000-0x0000000080c00000 12M PGD R W AF BLK
0x0000000080c00000-0x0000000080e00000 2M PGD XN R W AF BLK
0x0000000080e00000-0x0000000081000000 2M PGD R W AF BLK
0x0000000081000000-0x0000000081400000 4M PGD XN R W AF BLK
0x0000000081400000-0x000000008fe00000 234M PGD
0x000000008fe00000-0x0000000090000000 2M PGD XN R W AF BLK
0x0000000090000000-0x00000000fa000000 1696M PGD
0x00000000fa000000-0x00000000fe000000 64M PGD XN R W AF BLK
0x00000000fe000000-0x0000000100000000 32M PGD
0x0000000100000000-0x0000000101c00000 28M PGD XN R W AF BLK
0x0000000101c00000-0x0000000102000000 4M PGD
0x0000000102000000-0x0000000102200000 2M PGD XN R W AF BLK
0x0000000102200000-0x000000017b000000 1934M PGD
0x000000017b000000-0x0000000180000000 80M PGD XN R W AF BLK
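
For anyone post-processing the dump from userspace, the line format above
can be parsed with a few lines of script. The following is only a minimal
sketch and not part of this series: the names (Range, parse_line, SAMPLE)
are made up for illustration, the token after the size is assumed to be the
level name, and every remaining token is treated as an opaque attribute
flag. The sample input is copied from the first lines of the dump above.

```python
import re
from typing import NamedTuple, Optional, Tuple

# Size suffixes, assumed to be powers of 1024: K, M, G, T, P, E.
UNITS = {u: 1 << (10 * (i + 1)) for i, u in enumerate("KMGTPE")}

# <start>-<end> <size><unit> <level name> [attribute flags...]
LINE_RE = re.compile(
    r"^(0x[0-9a-fA-F]+)-(0x[0-9a-fA-F]+)\s+(\d+)([KMGTPE])\s+(\S+)\s*(.*)$"
)


class Range(NamedTuple):
    start: int               # first IPA of the range
    end: int                 # one past the last IPA of the range
    size: int                # size in bytes, decoded from the size column
    level: str               # level name column, e.g. "PGD"
    attrs: Tuple[str, ...]   # remaining flags; empty for unmapped ranges


def parse_line(line: str) -> Optional[Range]:
    """Parse one range line; return None for headers and other lines."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    size = int(m.group(3)) * UNITS[m.group(4)]
    return Range(int(m.group(1), 16), int(m.group(2), 16), size,
                 m.group(5), tuple(m.group(6).split()))


SAMPLE = """\
---[ IPA bits 33 start lvl 2 ]---
0x0000000000000000-0x0000000080000000 2G PGD
0x0000000080000000-0x0000000080c00000 12M PGD R W AF BLK
0x0000000080c00000-0x0000000080e00000 2M PGD XN R W AF BLK
"""

ranges = [r for r in map(parse_line, SAMPLE.splitlines()) if r is not None]

# Ranges that carry no attribute flags are treated as unmapped here.
mapped = sum(r.size for r in ranges if r.attrs)
```

A cheap sanity check on a parsed dump is that end - start equals the decoded
size for every range. Summing the sizes of ranges that carry attribute flags
gives a rough figure for how much IPA space is currently mapped.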

Link to v6:
https://lore.kernel.org/all/20240220151035.327199-1-sebastianene@xxxxxxxxxx/

Link to v5:
https://lore.kernel.org/all/20240207144832.1017815-2-sebastianene@xxxxxxxxxx/

Link to v4:
https://lore.kernel.org/all/20231218135859.2513568-2-sebastianene@xxxxxxxxxx/

Link to v3:
https://lore.kernel.org/all/20231115171639.2852644-2-sebastianene@xxxxxxxxxx/

Changelog:
v6 -> v7:
* reworded the commit message of [PATCH v6 2/6] arm64: ptdump: Expose
the attribute parsing functionality
* fixed minor conflicts in the struct pg_state definition
* moved the kvm_ptdump_guest_registration into
kvm_arch_create_vm_debugfs
* reset the parse state before walking the pagetables
* copy the level name to the pg_level buffer


v5 -> v6:
* don't return an error if the kvm_arch_create_vm_debugfs fails to
initialize (ref.
https://lore.kernel.org/all/20240216155941.2029458-1-oliver.upton@xxxxxxxxx/)
* fix a use-after-free by getting a reference to the KVM struct while
manipulating the debugfs files and putting the reference on file
close.
* do all the allocations at once for the ptdump parser state tracking
and simplify the initialization.
* move the ptdump parser state initialization as part of the file_open
* create separate files for printing the guest stage-2 pagetable
configuration such as: the start level of the pagetable walk and the
number of bits used for the IPA space representation.
* fixed the wrong header format for the newly added file
* include a missing patch that was not posted with v5:
"KVM-arm64-Move-pagetable-definitions-to-common-heade.patch"


v4 -> v5:
* refactoring: split the series into two parts as per the feedback
received from Oliver. Introduce the base support which allows dumping
of the guest stage-2 pagetables.
* removed the *ops* struct wrapper built on top of the file_ops and
simplified the ptdump interface access.
* keep the page table walker away from the ptdump specific code

v3 -> v4:
* refactoring: moved all the KVM specific components under
kvm/ as suggested by Oliver. Introduced a new file
arm64/kvm/ptdump.c which handles the second stage translation.
Re-used only the display portion from mm/ptdump.c
* pagetable snapshot creation now uses memory donated from the host.
The memory is no longer shared with the host as this can pose a security
risk if the host has access to manipulate the pagetable copy while
the hypervisor iterates it.
* fixed a memory leak: pages taken from the memcache to build the
snapshot pagetable were never given back to the host for freeing.
A separate array was introduced to keep track of the pages allocated
from the memcache.


v2 -> v3:
* register the stage-2 debugfs entry for the host under
/sys/kernel/debug/kvm/host_stage2_page_tables and in
/sys/kernel/debug/kvm/<guest_id>/stage2_page_tables for guests.
* don't use a static array for parsing the attributes description,
generate it dynamically based on the number of pagetable levels
* remove the lock that was guarding the seq_file private inode data,
and keep the data private to the open file session.
* minor fixes & renaming of CONFIG_NVHE_EL2_PTDUMP_DEBUGFS to
CONFIG_PTDUMP_STAGE2_DEBUGFS


v1 -> v2:
* use the stage-2 pagetable walker for dumping descriptors instead of
the one provided by ptdump.
* support dumping guest pagetables under VHE/nVHE non-protected mode

Thanks,

Sebastian Ene (6):
KVM: arm64: Move pagetable definitions to common header
arm64: ptdump: Expose the attribute parsing functionality
arm64: ptdump: Use the mask from the state structure
KVM: arm64: Register ptdump with debugfs on guest creation
KVM: arm64: Initialize the ptdump parser with stage-2 attributes
KVM: arm64: Expose guest stage-2 pagetable config to debugfs

arch/arm64/include/asm/kvm_pgtable.h | 42 +++++
arch/arm64/include/asm/ptdump.h | 42 ++++-
arch/arm64/kvm/Kconfig | 14 ++
arch/arm64/kvm/Makefile | 1 +
arch/arm64/kvm/arm.c | 2 +
arch/arm64/kvm/hyp/pgtable.c | 42 -----
arch/arm64/kvm/kvm_ptdump.h | 20 ++
arch/arm64/kvm/ptdump.c | 272 +++++++++++++++++++++++++++
arch/arm64/mm/ptdump.c | 50 +----
9 files changed, 402 insertions(+), 83 deletions(-)
create mode 100644 arch/arm64/kvm/kvm_ptdump.h
create mode 100644 arch/arm64/kvm/ptdump.c

--
2.45.2.741.gdbec12cfda-goog