[RFC PATCH 0/2] KVM: x86: Relay a nested Hyper-V root's vmbus posts to L0
From: Robert Nowotny
Date: Wed Jun 17 2026 - 11:12:25 EST

This RFC asks for direction on a small KVM/x86 addition before adding a selftest
and an SVM counterpart. It lets a nested Hyper-V root partition's vmbus come up
when the L1 hypervisor runs under KVM with a userspace VMM that owns the host
vmbus endpoint.

Patch 1 renames nested_evmcs_l2_tlb_flush_enabled() to
nested_evmcs_l2_direct_hypercall_enabled(), since the predicate is really "L1
granted this L2 the eVMCS direct-hypercall facility" and a second caller now
shares it. No functional change.

Patch 2 adds the relay.

The userspace user is OpenVMM (https://github.com/microsoft/openvmm); the
companion change that enables this capability with the bitmask will be posted to
OpenVMM later.

Problem
-------
A Windows guest that enables Hyper-V/VBS runs its own kernel as the root
partition of a nested hypervisor, i.e. as an L2 guest: guest kernel ->
nested hypervisor (L1) -> KVM (L0). The root's vmbus never connects. Its
HvPostMessage(InitiateContact) is an L2 VMCALL that exits to L0 and is
reflected up to L1, which has no path to forward it to the userspace VMM. The
guest bugchecks 0x7B early in boot.

What the patch does
-------------------
Add a per-VM capability whose argument is a bitmask of the nested Hyper-V
hypercall classes userspace wants kept in L0 (HvPostMessage, HvSignalEvent).
For a selected class, and when L1 has authorized the L2 for direct nested
hypercalls (nested_evmcs_l2_direct_hypercall_enabled(), the gate KVM already
honors for the L2 TLB-flush hypercall), the L2 VMCALL is handled in L0 instead
of reflected to L1: KVM clears the nested bit, translates the L2 GPA in the
input parameter to an L1 GPA via the nested MMU, and lets the existing
hypercall path deliver the post to userspace via KVM_EXIT_HYPERV, exactly as
for a non-nested guest.
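
To make the gating above concrete, here is a minimal standalone sketch of the
"keep in L0 or reflect to L1" predicate. It is not the code from patch 2. The
call codes 0x005c (HvPostMessage) and 0x005d (HvSignalEvent) and the 16-bit
call-code field come from the TLFS; the RELAY_CLASS_* bit assignments, the
function names, and the boolean standing in for
nested_evmcs_l2_direct_hypercall_enabled() are assumptions made for
illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hyper-V call codes as listed in the TLFS. */
#define HVCALL_POST_MESSAGE	0x005c
#define HVCALL_SIGNAL_EVENT	0x005d

/* Hypothetical bit layout for the capability argument (not from the patch). */
#define RELAY_CLASS_POST_MESSAGE	(1ULL << 0)
#define RELAY_CLASS_SIGNAL_EVENT	(1ULL << 1)

/* Map a call code to its relay class bit, or 0 if it is never relayed. */
static uint64_t relay_class_of(uint16_t call_code)
{
	switch (call_code) {
	case HVCALL_POST_MESSAGE:
		return RELAY_CLASS_POST_MESSAGE;
	case HVCALL_SIGNAL_EVENT:
		return RELAY_CLASS_SIGNAL_EVENT;
	default:
		return 0;
	}
}

/*
 * Return true if the L2 VMCALL should be handled in L0, false if it
 * should be reflected to L1. l1_granted_direct_hypercall stands in for
 * the result of nested_evmcs_l2_direct_hypercall_enabled().
 */
static bool keep_l2_hypercall_in_l0(uint64_t relay_mask,
				    bool l1_granted_direct_hypercall,
				    uint64_t hypercall_input)
{
	/* The call code occupies the low 16 bits of the input value. */
	uint16_t call_code = hypercall_input & 0xffff;

	if (!l1_granted_direct_hypercall)
		return false;

	return (relay_mask & relay_class_of(call_code)) != 0;
}
```

Both conditions must hold: a class that userspace did not select, or an L2
that L1 has not authorized, falls through to the existing reflection path.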

Why this belongs in the kernel
------------------------------
The message handling already lives in userspace and does not move: a non-nested
HvPostMessage exits to userspace today via KVM_EXIT_HYPERV, and the relayed
nested post takes the same exit. Only two steps cannot be done in userspace with
the current uAPI, and both are kernel-only primitives:

  1. Suppressing nested exit reflection. The "keep this L2 VMCALL in L0 instead
     of reflecting to L1" decision is made in nested_vmx_reflect_vmexit(); KVM
     does not exit to userspace on a nested L2 VM-exit before deciding
     reflection, and adding such an exit would be a much broader and riskier
     ABI. A nested exit also cannot be cleanly reflected to L1 after a userspace
     round-trip, which is why the decision stays in the kernel.

  2. Translating the L2 GPA to an L1 GPA, which needs the nested MMU / shadow
     EPT that userspace cannot walk.
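
As an illustration of step 2 only, the toy model below shows what "translate
the L2 GPA to an L1 GPA" means at page granularity: look up the frame, keep the
page offset. It deliberately does not use KVM's nested MMU. The flat lookup
table, the struct, and the function name are all hypothetical stand-ins for the
L1-controlled EPT that only the kernel can walk.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT	12
#define TOY_OFFSET_MASK	((1ULL << TOY_PAGE_SHIFT) - 1)

/* One toy mapping from an L2 guest frame number to an L1 guest frame number. */
struct toy_ept_entry {
	uint64_t l2_gfn;
	uint64_t l1_gfn;
};

/*
 * Translate l2_gpa through a flat table of 4 KiB mappings. Returns true
 * and stores the result in *l1_gpa on a hit, false if the frame is not
 * mapped (where a real walk would report an EPT violation instead).
 */
static bool toy_translate_l2_gpa(const struct toy_ept_entry *map, size_t n,
				 uint64_t l2_gpa, uint64_t *l1_gpa)
{
	uint64_t gfn = l2_gpa >> TOY_PAGE_SHIFT;

	for (size_t i = 0; i < n; i++) {
		if (map[i].l2_gfn == gfn) {
			/* Swap the frame, preserve the offset within the page. */
			*l1_gpa = (map[i].l1_gfn << TOY_PAGE_SHIFT) |
				  (l2_gpa & TOY_OFFSET_MASK);
			return true;
		}
	}
	return false;
}
```

The point of the sketch is the dependency, not the algorithm: the mapping is
owned by L1 and shadowed by L0, so the input-parameter address has to be
resolved before the hypercall is handed to the non-nested path.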

The relayable set is a userspace-supplied bitmask
-------------------------------------------------
args[0] selects which nested Hyper-V hypercall classes to keep in L0. The
reflection decision stays in the kernel, the choice of which calls to relay is
userspace's, and the kernel carries no vmbus-specific policy. New relayable
nested hypercalls can be added without another kernel change.
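
For reference, a VMM would opt in through the usual per-VM KVM_ENABLE_CAP
ioctl. The sketch below assumes the placeholder capability number 249 from this
cover letter; the macro name and the RELAY_CLASS_* bit values are hypothetical,
and the real names and layout are whatever patch 2 defines.

```c
#include <linux/kvm.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>

/* Placeholder number from this cover letter; the macro name is hypothetical. */
#define KVM_CAP_HYPERV_NESTED_RELAY_PLACEHOLDER	249

/* Hypothetical bit layout for args[0]. */
#define RELAY_CLASS_POST_MESSAGE	(1ULL << 0)
#define RELAY_CLASS_SIGNAL_EVENT	(1ULL << 1)

/* Build the KVM_ENABLE_CAP argument carrying the relay class bitmask. */
static struct kvm_enable_cap relay_enable_cap(uint64_t class_mask)
{
	struct kvm_enable_cap cap;

	memset(&cap, 0, sizeof(cap));
	cap.cap = KVM_CAP_HYPERV_NESTED_RELAY_PLACEHOLDER;
	cap.args[0] = class_mask;
	return cap;
}

/*
 * Enable the relay on a VM file descriptor obtained from KVM_CREATE_VM.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int relay_enable(int vm_fd, uint64_t class_mask)
{
	struct kvm_enable_cap cap = relay_enable_cap(class_mask);

	return ioctl(vm_fd, KVM_ENABLE_CAP, &cap);
}
```

A VMM that owns the host vmbus endpoint would presumably pass both bits, e.g.
relay_enable(vm_fd, RELAY_CLASS_POST_MESSAGE | RELAY_CLASS_SIGNAL_EVENT), before
the first vCPU runs.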

Scope and limitations
---------------------
  - VMX-only; no SVM counterpart yet.
  - The capability number 249 is a placeholder pending assignment.
  - No selftest yet (this is an RFC for direction). A selftest and, if the
    relay stays, an SVM path would come with the non-RFC series.

Tooling transparency
--------------------
This work was developed with AI assistance (Claude, claude-opus-4-8), reflected
in each patch's Assisted-by tag. The assistant analyzed the nested-exit
reflection and Hyper-V hypercall paths, drafted the comments and changelogs, and
cross-checked the behavior against the TLFS and the existing L2 TLB-flush
handling. The mechanism was derived from runtime analysis of a stock Windows
guest that bugchecks 0x7B without the relay and boots with it. The submitter has
reviewed the change in full and takes responsibility for it.

Testing
-------
The relay mechanism was validated on a Proxmox VE 7.0.2 kernel (the same logic,
applied to that tree): a stock nested Windows guest under a userspace VMM that
owns the host vmbus endpoint fails to bring up its root vmbus (0x7B) without the
capability and boots to the full desktop with it. checkpatch is clean on both
patches. A mainline KVM_INTEL=m KVM_AMD=m KVM_WERROR=y build and a KVM selftest
are still to come with the non-RFC series.

Yours sincerely

Ing. Robert Nowotny
CTO, Executive Technical Director
