[GIT PULL] First batch of KVM changes for Linux 6.13 merge window

From: Paolo Bonzini
Date: Wed Nov 20 2024 - 09:00:22 EST


Linus,

The following changes since commit 2d5404caa8c7bb5c4e0435f94b28834ae5456623:

Linux 6.12-rc7 (2024-11-10 14:19:35 -0800)

are available in the Git repository at:

https://git.kernel.org/pub/scm/virt/kvm/kvm.git tags/for-linus

for you to fetch changes up to 9ee62c33c0fe017ee02501a877f6f562363122fa:

KVM: x86: Break CONFIG_KVM_X86's direct dependency on KVM_INTEL || KVM_AMD (2024-11-19 19:34:51 -0500)

The only conflict is in arm64's sysreg tables and is between commit:

034993461890 ("arm64/sysreg: Update ID_AA64MMFR1_EL1 to DDI0601 2024-09")

from the arm64 tree and commit:

9ae424d2a1ae ("arm64: Define ID_AA64MMFR1_EL1.HAFDBS advertising FEAT_HAFT")

from the kvm-arm tree. The arm64 version is a superset of the KVM version.
There are also some arm64 changes such as commit 09e6b306f3ba ("arm64:
cpufeature: discover CPU support for MPAM") with acks from the maintainers.

Some RISC-V changes are missing and should come next week.

Thanks,

Paolo

----------------------------------------------------------------
The biggest change here is eliminating the awful idea that KVM had, of
essentially guessing which pfns are refcounted pages. The reason to
do so was that KVM needs to map both non-refcounted pages (for example
BARs of VFIO devices) and VM_PFNMAP/VM_MIXMEDMAP VMAs that contain
refcounted pages. However, the result was security issues in the past,
and more recently the inability to map VM_IO and VM_PFNMAP memory
that _is_ backed by struct page but is not refcounted. In particular
this broke virtio-gpu blob resources (which directly map host graphics
buffers into the guest as "vram" for the virtio-gpu device) with the
amdgpu driver, because amdgpu allocates non-compound higher order pages
and the tail pages could not be mapped into KVM.

This requires adjusting all uses of struct page in the per-architecture
code, to always work on the pfn whenever possible. The large series that
did this, from David Stevens and Sean Christopherson, also cleaned up
substantially the set of functions that provided arch code with the
pfn for a host virtual addresses. The previous maze of twisty little
passages, all different, is replaced by five functions (__gfn_to_page,
__kvm_faultin_pfn, the non-__ versions of these two, and kvm_prefetch_pages)
saving almost 200 lines of code.

ARM:

* Support for stage-1 permission indirection (FEAT_S1PIE) and
permission overlays (FEAT_S1POE), including nested virt + the
emulated page table walker

* Introduce PSCI SYSTEM_OFF2 support to KVM + client driver. This call
was introduced in PSCIv1.3 as a mechanism to request hibernation,
similar to the S4 state in ACPI

* Explicitly trap + hide FEAT_MPAM (QoS controls) from KVM guests. As
part of it, introduce trivial initialization of the host's MPAM
context so KVM can use the corresponding traps

* PMU support under nested virtualization, honoring the guest
hypervisor's trap configuration and event filtering when running a
nested guest

* Fixes to vgic ITS serialization where stale device/interrupt table
entries are not zeroed when the mapping is invalidated by the VM

* Avoid emulated MMIO completion if userspace has requested synchronous
external abort injection

* Various fixes and cleanups affecting pKVM, vCPU initialization, and
selftests

LoongArch:

* Add iocsr and mmio bus simulation in kernel.

* Add in-kernel interrupt controller emulation.

* Add support for virtualization extensions to the eiointc irqchip.

PPC:

* Drop lingering and utterly obsolete references to PPC970 KVM, which was
removed 10 years ago.

* Fix incorrect documentation references to non-existing ioctls

RISC-V:

* Accelerate KVM RISC-V when running as a guest

* Perf support to collect KVM guest statistics from host side

s390:

* New selftests: more ucontrol selftests and CPU model sanity checks

* Support for the gen17 CPU model

* List registers supported by KVM_GET/SET_ONE_REG in the documentation

x86:

* Cleanup KVM's handling of Accessed and Dirty bits to dedup code, improve
documentation, harden against unexpected changes. Even if the hardware
A/D tracking is disabled, it is possible to use the hardware-defined A/D
bits to track if a PFN is Accessed and/or Dirty, and that removes a lot
of special cases.

* Elide TLB flushes when aging secondary PTEs, as has been done in x86's
primary MMU for over 10 years.

* Recover huge pages in-place in the TDP MMU when dirty page logging is
toggled off, instead of zapping them and waiting until the page is
re-accessed to create a huge mapping. This reduces vCPU jitter.

* Batch TLB flushes when dirty page logging is toggled off. This reduces
the time it takes to disable dirty logging by ~3x.

* Remove the shrinker that was (poorly) attempting to reclaim shadow page
tables in low-memory situations.

* Clean up and optimize KVM's handling of writes to MSR_IA32_APICBASE.

* Advertise CPUIDs for new instructions in Clearwater Forest

* Quirk KVM's misguided behavior of initialized certain feature MSRs to
their maximum supported feature set, which can result in KVM creating
invalid vCPU state. E.g. initializing PERF_CAPABILITIES to a non-zero
value results in the vCPU having invalid state if userspace hides PDCM
from the guest, which in turn can lead to save/restore failures.

* Fix KVM's handling of non-canonical checks for vCPUs that support LA57
to better follow the "architecture", in quotes because the actual
behavior is poorly documented. E.g. most MSR writes and descriptor
table loads ignore CR4.LA57 and operate purely on whether the CPU
supports LA57.

* Bypass the register cache when querying CPL from kvm_sched_out(), as
filling the cache from IRQ context is generally unsafe; harden the
cache accessors to try to prevent similar issues from occuring in the
future. The issue that triggered this change was already fixed in 6.12,
but was still kinda latent.

* Advertise AMD_IBPB_RET to userspace, and fix a related bug where KVM
over-advertises SPEC_CTRL when trying to support cross-vendor VMs.

* Minor cleanups

* Switch hugepage recovery thread to use vhost_task. These kthreads can
consume significant amounts of CPU time on behalf of a VM or in response
to how the VM behaves (for example how it accesses its memory); therefore
KVM tried to place the thread in the VM's cgroups and charge the CPU
time consumed by that work to the VM's container. However the kthreads
did not process SIGSTOP/SIGCONT, and therefore cgroups which had KVM
instances inside could not complete freezing. Fix this by replacing the
kthread with a PF_USER_WORKER thread, via the vhost_task abstraction.
Another 100+ lines removed, with generally better behavior too like
having these threads properly parented in the process tree.

* Revert a workaround for an old CPU erratum (Nehalem/Westmere) that didn't
really work; there was really nothing to work around anyway: the broken
patch was meant to fix nested virtualization, but the PERF_GLOBAL_CTRL
MSR is virtualized and therefore unaffected by the erratum.

* Fix 6.12 regression where CONFIG_KVM will be built as a module even
if asked to be builtin, as long as neither KVM_INTEL nor KVM_AMD is 'y'.

x86 selftests:

* x86 selftests can now use AVX.

Documentation:

* Use rST internal links

* Reorganize the introduction to the API document

Generic:

* Protect vcpu->pid accesses outside of vcpu->mutex with a rwlock instead
of RCU, so that running a vCPU on a different task doesn't encounter long
due to having to wait for all CPUs become quiescent. In general both reads
and writes are rare, but userspace that supports confidential computing is
introducing the use of "helper" vCPUs that may jump from one host processor
to another. Those will be very happy to trigger a synchronize_rcu(), and
the effect on performance is quite the disaster.

----------------------------------------------------------------
Anup Patel (13):
RISC-V: KVM: Order the object files alphabetically
RISC-V: KVM: Save/restore HSTATUS in C source
RISC-V: KVM: Save/restore SCOUNTEREN in C source
RISC-V: KVM: Break down the __kvm_riscv_switch_to() into macros
RISC-V: KVM: Replace aia_set_hvictl() with aia_hvictl_value()
RISC-V: KVM: Don't setup SGEI for zero guest external interrupts
RISC-V: Add defines for the SBI nested acceleration extension
RISC-V: KVM: Add common nested acceleration support
RISC-V: KVM: Use nacl_csr_xyz() for accessing H-extension CSRs
RISC-V: KVM: Use nacl_csr_xyz() for accessing AIA CSRs
RISC-V: KVM: Use SBI sync SRET call when available
RISC-V: KVM: Save trap CSRs in kvm_riscv_vcpu_enter_exit()
RISC-V: KVM: Use NACL HFENCEs for KVM request based HFENCEs

Arnd Bergmann (1):
KVM: x86: add back X86_LOCAL_APIC dependency

Ba Jing (1):
KVM: selftests: Remove unused macro in the hardware disable test

Bibo Mao (1):
irqchip/loongson-eiointc: Add virt extension support

Björn Töpel (1):
riscv: kvm: Fix out-of-bounds array access

Christoph Schlameuss (5):
KVM: s390: selftests: Add uc_map_unmap VM test case
KVM: s390: selftests: Add uc_skey VM test case
KVM: s390: selftests: Verify reject memory region operations for ucontrol VMs
KVM: s390: selftests: Fix whitespace confusion in ucontrol test
KVM: s390: selftests: correct IP.b length in uc_handle_sieic debug output

David Matlack (5):
KVM: x86/mmu: Drop @max_level from kvm_mmu_max_mapping_level()
KVM: x86/mmu: Batch TLB flushes when zapping collapsible TDP MMU SPTEs
KVM: x86/mmu: Recover TDP MMU huge page mappings in-place instead of zapping
KVM: x86/mmu: Rename make_huge_page_split_spte() to make_small_spte()
KVM: x86/mmu: WARN if huge page recovery triggered during dirty logging

David Stevens (3):
KVM: Replace "async" pointer in gfn=>pfn with "no_wait" and error code
KVM: Introduce kvm_follow_pfn() to eventually replace "gfn_to_pfn" APIs
KVM: Migrate kvm_vcpu_map() to kvm_follow_pfn()

David Woodhouse (6):
firmware/psci: Add definitions for PSCI v1.3 specification
KVM: arm64: Add PSCI v1.3 SYSTEM_OFF2 function for hibernation
KVM: arm64: Add support for PSCI v1.2 and v1.3
KVM: selftests: Add test for PSCI SYSTEM_OFF2
KVM: arm64: nvhe: Pass through PSCI v1.3 SYSTEM_OFF2 call
arm64: Use SYSTEM_OFF2 PSCI call to power off for hibernate

Fuad Tabba (4):
KVM: arm64: Move pkvm_vcpu_init_traps() to init_pkvm_hyp_vcpu()
KVM: arm64: Refactor kvm_vcpu_enable_ptrauth() for hyp use
KVM: arm64: Initialize the hypervisor's VM state at EL2
KVM: arm64: Initialize trap register values in hyp in pKVM

Hariharan Mari (5):
KVM: s390: selftests: Add regression tests for SORTL and DFLTCC CPU subfunctions
KVM: s390: selftests: Add regression tests for PRNO, KDSA and KMA crypto subfunctions
KVM: s390: selftests: Add regression tests for KMCTR, KMF, KMO and PCC crypto subfunctions
KVM: s390: selftests: Add regression tests for KMAC, KMC, KM, KIMD and KLMD crypto subfunctions
KVM: s390: selftests: Add regression tests for PLO subfunctions

Hendrik Brueckner (4):
KVM: s390: add concurrent-function facility to cpu model
KVM: s390: add msa11 to cpu model
KVM: s390: add gen17 facilities to CPU model
KVM: s390: selftests: Add regression tests for PFCR subfunctions

James Clark (1):
KVM: arm64: Pass on SVE mapping failures

James Morse (7):
arm64/sysreg: Convert existing MPAM sysregs and add the remaining entries
arm64: head.S: Initialise MPAM EL2 registers and disable traps
arm64: cpufeature: discover CPU support for MPAM
KVM: arm64: Fix missing traps of guest accesses to the MPAM registers
KVM: arm64: Add a macro for creating filtered sys_reg_descs entries
KVM: arm64: Disable MPAM visibility by default and ignore VMM writes
KVM: arm64: selftests: Test ID_AA64PFR0.MPAM isn't completely ignored

Jiapeng Chong (1):
KVM: selftests: Use ARRAY_SIZE for array length

Jim Mattson (2):
KVM: x86: Advertise AMD_IBPB_RET to userspace
KVM: x86: AMD's IBPB is not equivalent to Intel's IBPB

Jing Zhang (1):
KVM: arm64: vgic-its: Add a data length check in vgic_its_save_*

Kai Huang (2):
KVM: x86: Fix a comment inside kvm_vcpu_update_apicv()
KVM: x86: Fix a comment inside __kvm_set_or_clear_apicv_inhibit()

Kunkun Jiang (2):
KVM: arm64: vgic-its: Clear DTE when MAPD unmaps a device
KVM: arm64: vgic-its: Clear ITE when DISCARD frees an ITE

Marc Zyngier (33):
arm64: Drop SKL0/SKL1 from TCR2_EL2
arm64: Remove VNCR definition for PIRE0_EL2
arm64: Add encoding for PIRE0_EL2
KVM: arm64: Drop useless struct s2_mmu in __kvm_at_s1e2()
KVM: arm64: nv: Add missing EL2->EL1 mappings in get_el2_to_el1_mapping()
KVM: arm64: nv: Handle CNTHCTL_EL2 specially
arm64: Define ID_AA64MMFR1_EL1.HAFDBS advertising FEAT_HAFT
KVM: arm64: Add TCR2_EL2 to the sysreg arrays
KVM: arm64: nv: Save/Restore vEL2 sysregs
KVM: arm64: Sanitise TCR2_EL2
KVM: arm64: Correctly access TCR2_EL1, PIR_EL1, PIRE0_EL1 with VHE
KVM: arm64: Add save/restore for TCR2_EL2
KVM: arm64: Extend masking facility to arbitrary registers
KVM: arm64: Add PIR{,E0}_EL2 to the sysreg arrays
KVM: arm64: Add save/restore for PIR{,E0}_EL2
KVM: arm64: Handle PIR{,E0}_EL2 traps
KVM: arm64: Add AT fast-path support for S1PIE
KVM: arm64: Split S1 permission evaluation into direct and hierarchical parts
KVM: arm64: Disable hierarchical permissions when S1PIE is enabled
KVM: arm64: Implement AT S1PIE support
KVM: arm64: Add a composite EL2 visibility helper
KVM: arm64: Rely on visibility to let PIR*_ELx/TCR2_ELx UNDEF
arm64: Add encoding for POR_EL2
KVM: arm64: Drop bogus CPTR_EL2.E0POE trap routing
KVM: arm64: Subject S1PIE/S1POE registers to HCR_EL2.{TVM,TRVM}
KVM: arm64: Add kvm_has_s1poe() helper
KVM: arm64: Add basic support for POR_EL2
KVM: arm64: Add save/restore support for POR_EL2
KVM: arm64: Add POE save/restore for AT emulation fast-path
KVM: arm64: Disable hierarchical permissions when POE is enabled
KVM: arm64: Make PAN conditions part of the S1 walk context
KVM: arm64: Handle stage-1 permission overlays
KVM: arm64: Handle WXN attribute

Mark Brown (3):
KVM: arm64: Define helper for EL2 registers with custom visibility
KVM: arm64: Hide TCR2_EL1 from userspace when disabled for guests
KVM: arm64: Hide S1PIE registers from userspace when disabled for guests

Maxim Levitsky (5):
KVM: x86: drop x86.h include from cpuid.h
KVM: x86: Route non-canonical checks in emulator through emulate_ops
KVM: x86: Add X86EMUL_F_MSR and X86EMUL_F_DT_LOAD to aid canonical checks
KVM: x86: model canonical checks more precisely
KVM: nVMX: fix canonical check of vmcs12 HOST_RIP

Oliver Upton (28):
KVM: arm64: Don't retire aborted MMIO instruction
tools: arm64: Grab a copy of esr.h from kernel
KVM: arm64: selftests: Convert to kernel's ESR terminology
KVM: arm64: selftests: Add tests for MMIO external abort injection
arm64: sysreg: Describe ID_AA64DFR2_EL1 fields
arm64: sysreg: Migrate MDCR_EL2 definition to table
arm64: sysreg: Add new definitions for ID_AA64DFR0_EL1
KVM: arm64: Describe RES0/RES1 bits of MDCR_EL2
KVM: arm64: nv: Allow coarse-grained trap combos to use complex traps
KVM: arm64: nv: Rename BEHAVE_FORWARD_ANY
KVM: arm64: nv: Reinject traps that take effect in Host EL0
KVM: arm64: nv: Honor MDCR_EL2.{TPM, TPMCR} in Host EL0
KVM: arm64: nv: Describe trap behaviour of MDCR_EL2.HPMN
KVM: arm64: nv: Advertise support for FEAT_HPMN0
KVM: arm64: Rename kvm_pmu_valid_counter_mask()
KVM: arm64: nv: Adjust range of accessible PMCs according to HPMN
KVM: arm64: Add helpers to determine if PMC counts at a given EL
KVM: arm64: nv: Honor MDCR_EL2.HPME
KVM: arm64: nv: Honor MDCR_EL2.HLP
KVM: arm64: nv: Apply EL2 event filtering when in hyp context
KVM: arm64: nv: Reprogram PMU events affected by nested transition
Merge branch kvm-arm64/nv-s1pie-s1poe into kvmarm/next
Merge branch kvm-arm64/psci-1.3 into kvmarm/next
Merge branch kvm-arm64/mpam-ni into kvmarm/next
Merge branch kvm-arm64/misc into kvmarm/next
Merge branch kvm-arm64/mmio-sea into kvmarm/next
Merge branch kvm-arm64/nv-pmu into kvmarm/next
Merge branch kvm-arm64/vgic-its-fixes into kvmarm/next

Paolo Bonzini (18):
Merge branch 'kvm-no-struct-page' into HEAD
KVM: powerpc: remove remaining traces of KVM_CAP_PPC_RMA
Documentation: kvm: fix a few mistakes
Documentation: kvm: replace section numbers with links
Documentation: kvm: reorganize introduction
Merge tag 'kvm-riscv-6.13-1' of https://github.com/kvm-riscv/linux into HEAD
Merge tag 'kvm-s390-next-6.13-1' of https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD
Merge tag 'kvm-x86-generic-6.13' of https://github.com/kvm-x86/linux into HEAD
Merge tag 'kvm-x86-mmu-6.13' of https://github.com/kvm-x86/linux into HEAD
Merge tag 'kvm-x86-selftests-6.13' of https://github.com/kvm-x86/linux into HEAD
Merge tag 'kvm-x86-vmx-6.13' of https://github.com/kvm-x86/linux into HEAD
Merge tag 'kvm-x86-misc-6.13' of https://github.com/kvm-x86/linux into HEAD
Merge branch 'kvm-docs-6.13' into HEAD
Documentation: KVM: fix malformed table
KVM: x86: expose MSR_PLATFORM_INFO as a feature MSR
Merge tag 'kvmarm-6.13' of https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
Merge tag 'loongarch-kvm-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson into HEAD
KVM: x86: switch hugepage recovery thread to vhost_task

Quan Zhou (2):
riscv: perf: add guest vs host distinction
riscv: KVM: add basic support for host vs guest profiling

Raghavendra Rao Ananta (1):
KVM: arm64: Get rid of userspace_irqchip_in_use

Sean Christopherson (142):
KVM: Drop KVM_ERR_PTR_BAD_PAGE and instead return NULL to indicate an error
KVM: Allow calling kvm_release_page_{clean,dirty}() on a NULL page pointer
KVM: Add kvm_release_page_unused() API to put pages that KVM never consumes
KVM: x86/mmu: Skip the "try unsync" path iff the old SPTE was a leaf SPTE
KVM: x86/mmu: Don't overwrite shadow-present MMU SPTEs when prefaulting
KVM: x86/mmu: Invert @can_unsync and renamed to @synchronizing
KVM: x86/mmu: Mark new SPTE as Accessed when synchronizing existing SPTE
KVM: x86/mmu: Mark folio dirty when creating SPTE, not when zapping/modifying
KVM: x86/mmu: Mark page/folio accessed only when zapping leaf SPTEs
KVM: x86/mmu: Use gfn_to_page_many_atomic() when prefetching indirect PTEs
KVM: Rename gfn_to_page_many_atomic() to kvm_prefetch_pages()
KVM: Drop @atomic param from gfn=>pfn and hva=>pfn APIs
KVM: Annotate that all paths in hva_to_pfn() might sleep
KVM: Return ERR_SIGPENDING from hva_to_pfn() if GUP returns -EGAIN
KVM: Drop extra GUP (via check_user_page_hwpoison()) to detect poisoned page
KVM: x86/mmu: Drop kvm_page_fault.hva, i.e. don't track intermediate hva
KVM: Drop unused "hva" pointer from __gfn_to_pfn_memslot()
KVM: Remove pointless sanity check on @map param to kvm_vcpu_(un)map()
KVM: Explicitly initialize all fields at the start of kvm_vcpu_map()
KVM: Use NULL for struct page pointer to indicate mremapped memory
KVM: nVMX: Rely on kvm_vcpu_unmap() to track validity of eVMCS mapping
KVM: nVMX: Drop pointless msr_bitmap_map field from struct nested_vmx
KVM: nVMX: Add helper to put (unmap) vmcs12 pages
KVM: Use plain "struct page" pointer instead of single-entry array
KVM: Provide refcounted page as output field in struct kvm_follow_pfn
KVM: Move kvm_{set,release}_page_{clean,dirty}() helpers up in kvm_main.c
KVM: pfncache: Precisely track refcounted pages
KVM: Pin (as in FOLL_PIN) pages during kvm_vcpu_map()
KVM: nVMX: Mark vmcs12's APIC access page dirty when unmapping
KVM: Pass in write/dirty to kvm_vcpu_map(), not kvm_vcpu_unmap()
KVM: Get writable mapping for __kvm_vcpu_map() only when necessary
KVM: Disallow direct access (w/o mmu_notifier) to unpinned pfn by default
KVM: x86: Don't fault-in APIC access page during initial allocation
KVM: x86/mmu: Add "mmu" prefix fault-in helpers to free up generic names
KVM: x86/mmu: Put direct prefetched pages via kvm_release_page_clean()
KVM: x86/mmu: Add common helper to handle prefetching SPTEs
KVM: x86/mmu: Add helper to "finish" handling a guest page fault
KVM: x86/mmu: Mark pages/folios dirty at the origin of make_spte()
KVM: Move declarations of memslot accessors up in kvm_host.h
KVM: Add kvm_faultin_pfn() to specifically service guest page faults
KVM: x86/mmu: Convert page fault paths to kvm_faultin_pfn()
KVM: guest_memfd: Pass index, not gfn, to __kvm_gmem_get_pfn()
KVM: guest_memfd: Provide "struct page" as output from kvm_gmem_get_pfn()
KVM: x86/mmu: Put refcounted pages instead of blindly releasing pfns
KVM: x86/mmu: Don't mark unused faultin pages as accessed
KVM: Move x86's API to release a faultin page to common KVM
KVM: VMX: Hold mmu_lock until page is released when updating APIC access page
KVM: VMX: Use __kvm_faultin_page() to get APIC access page/pfn
KVM: PPC: e500: Mark "struct page" dirty in kvmppc_e500_shadow_map()
KVM: PPC: e500: Mark "struct page" pfn accessed before dropping mmu_lock
KVM: PPC: e500: Use __kvm_faultin_pfn() to handle page faults
KVM: arm64: Mark "struct page" pfns accessed/dirty before dropping mmu_lock
KVM: arm64: Use __kvm_faultin_pfn() to handle memory aborts
KVM: RISC-V: Mark "struct page" pfns dirty iff a stage-2 PTE is installed
KVM: RISC-V: Mark "struct page" pfns accessed before dropping mmu_lock
KVM: RISC-V: Use kvm_faultin_pfn() when mapping pfns into the guest
KVM: PPC: Use __kvm_faultin_pfn() to handle page faults on Book3s HV
KVM: PPC: Use __kvm_faultin_pfn() to handle page faults on Book3s Radix
KVM: PPC: Drop unused @kvm_ro param from kvmppc_book3s_instantiate_page()
KVM: PPC: Book3S: Mark "struct page" pfns dirty/accessed after installing PTE
KVM: PPC: Use kvm_faultin_pfn() to handle page faults on Book3s PR
KVM: LoongArch: Mark "struct page" pfns dirty only in "slow" page fault path
KVM: LoongArch: Mark "struct page" pfns accessed only in "slow" page fault path
KVM: LoongArch: Mark "struct page" pfn accessed before dropping mmu_lock
KVM: LoongArch: Use kvm_faultin_pfn() to map pfns into the guest
KVM: MIPS: Mark "struct page" pfns dirty only in "slow" page fault path
KVM: MIPS: Mark "struct page" pfns accessed only in "slow" page fault path
KVM: MIPS: Mark "struct page" pfns accessed prior to dropping mmu_lock
KVM: MIPS: Use kvm_faultin_pfn() to map pfns into the guest
KVM: PPC: Remove extra get_page() to fix page refcount leak
KVM: PPC: Use kvm_vcpu_map() to map guest memory to patch dcbz instructions
KVM: Convert gfn_to_page() to use kvm_follow_pfn()
KVM: Add support for read-only usage of gfn_to_page()
KVM: arm64: Use __gfn_to_page() when copying MTE tags to/from userspace
KVM: PPC: Explicitly require struct page memory for Ultravisor sharing
KVM: Drop gfn_to_pfn() APIs now that all users are gone
KVM: s390: Use kvm_release_page_dirty() to unpin "struct page" memory
KVM: Make kvm_follow_pfn.refcounted_page a required field
KVM: x86/mmu: Don't mark "struct page" accessed when zapping SPTEs
KVM: arm64: Don't mark "struct page" accessed when making SPTE young
KVM: Drop APIs that manipulate "struct page" via pfns
KVM: Don't grab reference on VM_MIXEDMAP pfns that have a "struct page"
KVM: Rework core loop of kvm_vcpu_on_spin() to use a single for-loop
KVM: Return '0' directly when there's no task to yield to
KVM: Protect vCPU's "last run PID" with rwlock, not RCU
KVM: x86/mmu: Flush remote TLBs iff MMU-writable flag is cleared from RO SPTE
KVM: x86/mmu: Always set SPTE's dirty bit if it's created as writable
KVM: x86/mmu: Fold all of make_spte()'s writable handling into one if-else
KVM: x86/mmu: Don't force flush if SPTE update clears Accessed bit
KVM: x86/mmu: Don't flush TLBs when clearing Dirty bit in shadow MMU
KVM: x86/mmu: Drop ignored return value from kvm_tdp_mmu_clear_dirty_slot()
KVM: x86/mmu: Fold mmu_spte_update_no_track() into mmu_spte_update()
KVM: x86/mmu: WARN and flush if resolving a TDP MMU fault clears MMU-writable
KVM: x86/mmu: Add a dedicated flag to track if A/D bits are globally enabled
KVM: x86/mmu: Set shadow_accessed_mask for EPT even if A/D bits disabled
KVM: x86/mmu: Set shadow_dirty_mask for EPT even if A/D bits disabled
KVM: x86/mmu: Use Accessed bit even when _hardware_ A/D bits are disabled
KVM: x86/mmu: Process only valid TDP MMU roots when aging a gfn range
KVM: x86/mmu: Stop processing TDP MMU roots for test_age if young SPTE found
KVM: x86/mmu: Dedup logic for detecting TLB flushes on leaf SPTE changes
KVM: x86/mmu: Set Dirty bit for new SPTEs, even if _hardware_ A/D bits are disabled
KVM: Allow arch code to elide TLB flushes when aging a young page
KVM: x86: Don't emit TLB flushes when aging SPTEs for mmu_notifiers
KVM: x86: Ensure vcpu->mode is loaded from memory in kvm_vcpu_exit_request()
KVM: x86: Bypass register cache when querying CPL from kvm_sched_out()
KVM: x86: Add lockdep-guarded asserts on register cache usage
KVM: x86: Use '0' for guest RIP if PMI encounters protected guest state
KVM: x86: Document an erratum in KVM_SET_VCPU_EVENTS on Intel CPUs
KVM: x86: Co-locate initialization of feature MSRs in kvm_arch_vcpu_create()
KVM: x86: Disallow changing MSR_PLATFORM_INFO after vCPU has run
KVM: x86: Quirk initialization of feature MSRs to KVM's max configuration
KVM: x86: Reject userspace attempts to access PERF_CAPABILITIES w/o PDCM
KVM: VMX: Remove restriction that PMU version > 0 for PERF_CAPABILITIES
KVM: x86: Reject userspace attempts to access ARCH_CAPABILITIES w/o support
KVM: x86: Remove ordering check b/w MSR_PLATFORM_INFO and MISC_FEATURES_ENABLES
KVM: selftests: Verify get/set PERF_CAPABILITIES w/o guest PDMC behavior
KVM: selftests: Add a testcase for disabling feature MSRs init quirk
KVM: selftests: Precisely mask off dynamic fields in CPUID test
KVM: selftests: Mask off OSPKE and OSXSAVE when comparing CPUID entries
KVM: selftests: Rework OSXSAVE CR4=>CPUID test to play nice with AVX insns
KVM: selftests: Configure XCR0 to max supported value by default
KVM: selftests: Verify XCR0 can be "downgraded" and "upgraded"
KVM: selftests: Drop manual CR4.OSXSAVE enabling from CR4/CPUID sync test
KVM: selftests: Drop manual XCR0 configuration from AMX test
KVM: selftests: Drop manual XCR0 configuration from state test
KVM: selftests: Drop manual XCR0 configuration from SEV smoke test
KVM: selftests: Ensure KVM supports AVX for SEV-ES VMSA FPU test
KVM: x86/mmu: Check yielded_gfn for forward progress iff resched is needed
KVM: x86/mmu: Demote the WARN on yielded in xxx_cond_resched() to KVM_MMU_WARN_ON
KVM: x86/mmu: Refactor TDP MMU iter need resched check
KVM: x86: Short-circuit all kvm_lapic_set_base() if MSR value isn't changing
KVM: x86: Drop superfluous kvm_lapic_set_base() call when setting APIC state
KVM: x86: Get vcpu->arch.apic_base directly and drop kvm_get_apic_base()
KVM: x86: Inline kvm_get_apic_mode() in lapic.h
KVM: x86: Move kvm_set_apic_base() implementation to lapic.c (from x86.c)
KVM: x86: Rename APIC base setters to better capture their relationship
KVM: x86: Make kvm_recalculate_apic_map() local to lapic.c
KVM: x86: Unpack msr_data structure prior to calling kvm_apic_set_base()
KVM: x86: Short-circuit all of kvm_apic_set_base() if MSR value is unchanged
KVM: selftests: Don't bother deleting memslots in KVM when freeing VMs
Revert "KVM: VMX: Move LOAD_IA32_PERF_GLOBAL_CTRL errata handling out of setup_vmcs_config()"
KVM: x86: Break CONFIG_KVM_X86's direct dependency on KVM_INTEL || KVM_AMD

Shameer Kolothum (1):
KVM: arm64: Make L1Ip feature in CTR_EL0 writable from userspace

Tao Su (1):
x86: KVM: Advertise CPUIDs for new instructions in Clearwater Forest

Vipin Sharma (2):
KVM: x86/mmu: Remove KVM's MMU shrinker
KVM: x86/mmu: Drop per-VM zapped_obsolete_pages list

Will Deacon (2):
KVM: arm64: Just advertise SEIS as 0 when emulating ICC_CTLR_EL1
KVM: arm64: Don't map 'kvm_vgic_global_state' at EL2 with pKVM

Xianglai Li (11):
LoongArch: KVM: Add iocsr and mmio bus simulation in kernel
LoongArch: KVM: Add IPI device support
LoongArch: KVM: Add IPI read and write function
LoongArch: KVM: Add IPI user mode read and write function
LoongArch: KVM: Add EIOINTC device support
LoongArch: KVM: Add EIOINTC read and write functions
LoongArch: KVM: Add EIOINTC user mode read and write functions
LoongArch: KVM: Add PCHPIC device support
LoongArch: KVM: Add PCHPIC read and write functions
LoongArch: KVM: Add PCHPIC user mode read and write functions
LoongArch: KVM: Add irqfd support

Yan Zhao (1):
KVM: VMX: Remove the unused variable "gpa" in __invept()

Yong-Xuan Wang (1):
RISC-V: KVM: Fix APLIC in_clrip and clripnum write emulation


Documentation/arch/arm64/cpu-feature-registers.rst | 2 +
Documentation/arch/loongarch/irq-chip-model.rst | 64 ++
.../zh_CN/arch/loongarch/irq-chip-model.rst | 55 +
Documentation/virt/kvm/api.rst | 190 ++--
Documentation/virt/kvm/locking.rst | 80 +-
Documentation/virt/kvm/x86/errata.rst | 12 +
arch/arm64/include/asm/cpu.h | 1 +
arch/arm64/include/asm/cpucaps.h | 5 +
arch/arm64/include/asm/cpufeature.h | 17 +
arch/arm64/include/asm/el2_setup.h | 14 +
arch/arm64/include/asm/kvm_arm.h | 30 +-
arch/arm64/include/asm/kvm_asm.h | 1 -
arch/arm64/include/asm/kvm_emulate.h | 9 +
arch/arm64/include/asm/kvm_host.h | 46 +-
arch/arm64/include/asm/kvm_pgtable.h | 4 +-
arch/arm64/include/asm/sysreg.h | 12 -
arch/arm64/include/asm/vncr_mapping.h | 1 -
arch/arm64/include/uapi/asm/kvm.h | 6 +
arch/arm64/kernel/cpufeature.c | 96 ++
arch/arm64/kernel/cpuinfo.c | 3 +
arch/arm64/kvm/arch_timer.c | 3 +-
arch/arm64/kvm/arm.c | 26 +-
arch/arm64/kvm/at.c | 484 +++++++--
arch/arm64/kvm/emulate-nested.c | 301 +++---
arch/arm64/kvm/guest.c | 15 +-
arch/arm64/kvm/hyp/include/hyp/switch.h | 31 +
arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h | 11 +-
arch/arm64/kvm/hyp/include/nvhe/trap_handler.h | 2 -
arch/arm64/kvm/hyp/nvhe/hyp-main.c | 12 +-
arch/arm64/kvm/hyp/nvhe/pkvm.c | 116 ++-
arch/arm64/kvm/hyp/nvhe/psci-relay.c | 2 +
arch/arm64/kvm/hyp/nvhe/setup.c | 20 +-
arch/arm64/kvm/hyp/nvhe/sysreg-sr.c | 2 +-
arch/arm64/kvm/hyp/pgtable.c | 7 +-
arch/arm64/kvm/hyp/vgic-v3-sr.c | 3 -
arch/arm64/kvm/hyp/vhe/sysreg-sr.c | 160 ++-
arch/arm64/kvm/hypercalls.c | 2 +
arch/arm64/kvm/mmio.c | 32 +-
arch/arm64/kvm/mmu.c | 21 +-
arch/arm64/kvm/nested.c | 82 +-
arch/arm64/kvm/pmu-emul.c | 143 ++-
arch/arm64/kvm/psci.c | 44 +-
arch/arm64/kvm/reset.c | 5 -
arch/arm64/kvm/sys_regs.c | 309 ++++--
arch/arm64/kvm/vgic/vgic-its.c | 32 +-
arch/arm64/kvm/vgic/vgic.h | 23 +
arch/arm64/tools/cpucaps | 2 +
arch/arm64/tools/sysreg | 249 ++++-
arch/loongarch/include/asm/irq.h | 1 +
arch/loongarch/include/asm/kvm_eiointc.h | 123 +++
arch/loongarch/include/asm/kvm_host.h | 18 +-
arch/loongarch/include/asm/kvm_ipi.h | 45 +
arch/loongarch/include/asm/kvm_pch_pic.h | 62 ++
arch/loongarch/include/uapi/asm/kvm.h | 20 +
arch/loongarch/kvm/Kconfig | 5 +-
arch/loongarch/kvm/Makefile | 4 +
arch/loongarch/kvm/exit.c | 82 +-
arch/loongarch/kvm/intc/eiointc.c | 1027 ++++++++++++++++++
arch/loongarch/kvm/intc/ipi.c | 475 +++++++++
arch/loongarch/kvm/intc/pch_pic.c | 519 +++++++++
arch/loongarch/kvm/irqfd.c | 89 ++
arch/loongarch/kvm/main.c | 19 +-
arch/loongarch/kvm/mmu.c | 40 +-
arch/loongarch/kvm/vcpu.c | 3 +
arch/loongarch/kvm/vm.c | 21 +
arch/mips/kvm/mmu.c | 26 +-
arch/powerpc/include/asm/kvm_book3s.h | 4 +-
arch/powerpc/kvm/book3s.c | 7 +-
arch/powerpc/kvm/book3s_32_mmu_host.c | 7 +-
arch/powerpc/kvm/book3s_64_mmu_host.c | 12 +-
arch/powerpc/kvm/book3s_64_mmu_hv.c | 25 +-
arch/powerpc/kvm/book3s_64_mmu_radix.c | 35 +-
arch/powerpc/kvm/book3s_hv_nested.c | 4 +-
arch/powerpc/kvm/book3s_hv_uvmem.c | 25 +-
arch/powerpc/kvm/book3s_pr.c | 14 +-
arch/powerpc/kvm/book3s_xive_native.c | 2 +-
arch/powerpc/kvm/e500_mmu_host.c | 19 +-
arch/powerpc/kvm/powerpc.c | 3 -
arch/riscv/include/asm/kvm_host.h | 10 +
arch/riscv/include/asm/kvm_nacl.h | 245 +++++
arch/riscv/include/asm/perf_event.h | 6 +
arch/riscv/include/asm/sbi.h | 120 +++
arch/riscv/kernel/perf_callchain.c | 38 +
arch/riscv/kvm/Kconfig | 1 +
arch/riscv/kvm/Makefile | 27 +-
arch/riscv/kvm/aia.c | 114 +-
arch/riscv/kvm/aia_aplic.c | 3 +-
arch/riscv/kvm/main.c | 63 +-
arch/riscv/kvm/mmu.c | 13 +-
arch/riscv/kvm/nacl.c | 152 +++
arch/riscv/kvm/tlb.c | 57 +-
arch/riscv/kvm/vcpu.c | 191 +++-
arch/riscv/kvm/vcpu_sbi.c | 11 +-
arch/riscv/kvm/vcpu_switch.S | 137 ++-
arch/riscv/kvm/vcpu_timer.c | 28 +-
arch/s390/include/asm/kvm_host.h | 1 +
arch/s390/include/uapi/asm/kvm.h | 3 +-
arch/s390/kvm/kvm-s390.c | 43 +-
arch/s390/kvm/vsie.c | 7 +-
arch/s390/tools/gen_facilities.c | 2 +
arch/x86/include/asm/cpufeatures.h | 3 +
arch/x86/include/asm/kvm-x86-ops.h | 1 +
arch/x86/include/asm/kvm_host.h | 13 +-
arch/x86/include/uapi/asm/kvm.h | 1 +
arch/x86/kvm/Kconfig | 6 +-
arch/x86/kvm/cpuid.c | 22 +-
arch/x86/kvm/cpuid.h | 1 -
arch/x86/kvm/emulate.c | 15 +-
arch/x86/kvm/kvm_cache_regs.h | 17 +
arch/x86/kvm/kvm_emulate.h | 5 +
arch/x86/kvm/lapic.c | 51 +-
arch/x86/kvm/lapic.h | 11 +-
arch/x86/kvm/mmu.h | 1 +
arch/x86/kvm/mmu/mmu.c | 444 +++-----
arch/x86/kvm/mmu/mmu_internal.h | 10 +-
arch/x86/kvm/mmu/paging_tmpl.h | 31 +-
arch/x86/kvm/mmu/spte.c | 102 +-
arch/x86/kvm/mmu/spte.h | 78 +-
arch/x86/kvm/mmu/tdp_mmu.c | 280 +++--
arch/x86/kvm/mmu/tdp_mmu.h | 6 +-
arch/x86/kvm/mtrr.c | 1 +
arch/x86/kvm/reverse_cpuid.h | 1 +
arch/x86/kvm/svm/nested.c | 4 +-
arch/x86/kvm/svm/sev.c | 12 +-
arch/x86/kvm/svm/svm.c | 13 +-
arch/x86/kvm/vmx/hyperv.c | 1 +
arch/x86/kvm/vmx/main.c | 1 +
arch/x86/kvm/vmx/nested.c | 77 +-
arch/x86/kvm/vmx/pmu_intel.c | 2 +-
arch/x86/kvm/vmx/sgx.c | 5 +-
arch/x86/kvm/vmx/vmx.c | 131 +--
arch/x86/kvm/vmx/vmx.h | 3 +-
arch/x86/kvm/vmx/vmx_ops.h | 16 +-
arch/x86/kvm/x86.c | 141 ++-
arch/x86/kvm/x86.h | 48 +-
drivers/firmware/psci/psci.c | 45 +
drivers/irqchip/irq-loongson-eiointc.c | 102 +-
include/kvm/arm_arch_timer.h | 3 +
include/kvm/arm_pmu.h | 18 +-
include/kvm/arm_psci.h | 4 +-
include/linux/kvm_host.h | 127 ++-
include/trace/events/kvm.h | 35 +
include/uapi/linux/kvm.h | 8 +
include/uapi/linux/psci.h | 5 +
kernel/power/hibernate.c | 5 +-
tools/arch/arm64/include/asm/brk-imm.h | 42 +
tools/arch/arm64/include/asm/esr.h | 455 ++++++++
tools/arch/s390/include/uapi/asm/kvm.h | 3 +-
tools/testing/selftests/kvm/Makefile | 5 +-
.../selftests/kvm/aarch64/debug-exceptions.c | 10 +-
tools/testing/selftests/kvm/aarch64/mmio_abort.c | 159 +++
tools/testing/selftests/kvm/aarch64/no-vgic-v3.c | 2 +-
.../selftests/kvm/aarch64/page_fault_test.c | 4 +-
tools/testing/selftests/kvm/aarch64/psci_test.c | 92 ++
tools/testing/selftests/kvm/aarch64/set_id_regs.c | 99 +-
.../selftests/kvm/aarch64/vpmu_counter_access.c | 12 +-
.../testing/selftests/kvm/hardware_disable_test.c | 1 -
.../selftests/kvm/include/aarch64/processor.h | 15 +-
.../testing/selftests/kvm/include/s390x/facility.h | 50 +
.../selftests/kvm/include/s390x/processor.h | 6 +
.../selftests/kvm/include/x86_64/processor.h | 5 +
.../testing/selftests/kvm/lib/aarch64/processor.c | 6 +-
tools/testing/selftests/kvm/lib/kvm_util.c | 10 +-
tools/testing/selftests/kvm/lib/s390x/facility.c | 14 +
tools/testing/selftests/kvm/lib/x86_64/processor.c | 24 +
.../selftests/kvm/s390x/cpumodel_subfuncs_test.c | 301 ++++++
tools/testing/selftests/kvm/s390x/ucontrol_test.c | 322 +++++-
tools/testing/selftests/kvm/x86_64/amx_test.c | 23 +-
tools/testing/selftests/kvm/x86_64/cpuid_test.c | 67 +-
.../selftests/kvm/x86_64/cr4_cpuid_sync_test.c | 53 +-
tools/testing/selftests/kvm/x86_64/debug_regs.c | 2 +-
.../selftests/kvm/x86_64/feature_msrs_test.c | 113 ++
.../selftests/kvm/x86_64/get_msr_index_features.c | 35 -
.../selftests/kvm/x86_64/platform_info_test.c | 2 -
.../testing/selftests/kvm/x86_64/sev_smoke_test.c | 19 +-
tools/testing/selftests/kvm/x86_64/state_test.c | 5 -
.../selftests/kvm/x86_64/vmx_pmu_caps_test.c | 23 +
.../testing/selftests/kvm/x86_64/xcr0_cpuid_test.c | 11 +-
virt/kvm/Kconfig | 4 +
virt/kvm/guest_memfd.c | 28 +-
virt/kvm/kvm_main.c | 1098 ++++++++------------
virt/kvm/kvm_mm.h | 36 +-
virt/kvm/pfncache.c | 20 +-
183 files changed, 9216 insertions(+), 2655 deletions(-)