On Mon, Jan 10, 2022 at 6:11 PM Like Xu <like.xu.linux@xxxxxxxxx> wrote:
On 11/1/2022 2:13 am, Jim Mattson wrote:
On Sun, Jan 9, 2022 at 10:23 PM Like Xu <like.xu.linux@xxxxxxxxx> wrote:
On 9/1/2022 9:23 am, Jim Mattson wrote:
On Fri, Dec 10, 2021 at 7:48 PM Jim Mattson <jmattson@xxxxxxxxxx> wrote:
On Fri, Dec 10, 2021 at 6:15 PM Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
On 12/10/21 20:25, Jim Mattson wrote:
In the long run, I'd like to be able to override this system-wide
setting on a per-VM basis, for VMs that I trust. (Of course, this
implies that I trust the userspace process as well.)
How would you feel if we were to add a kvm ioctl to override this
setting, for a particular VM, guarded by an appropriate permissions
check, like capable(CAP_SYS_ADMIN) or capable(CAP_SYS_MODULE)?
What's the rationale for guarding this with a capability check? IIRC
you don't have such checks for perf_event_open (apart from getting kernel
addresses, which is not a problem for virtualization).
My reasoning was simply that for userspace to override a mode 0444
kernel module parameter, it should have the rights to reload the
module with the parameter override. I wasn't thinking specifically
about PMU capabilities.
Do we have a precedent for a privileged user overriding any module parameter?
A further requirement is whether we can dynamically change this part of
the behaviour when the guest is already booted up.
Assuming that we trust userspace to decide whether or not to expose a
virtual PMU to a guest (as we do on the Intel side), perhaps we could
make use of the existing PMU_EVENT_FILTER to give us per-VM control,
rather than adding a new module parameter for per-host control.
Various granularities of control are required to support vPMU production
scenarios, including per-host, per-VM, and dynamic-guest-alive control.
If userspace calls KVM_SET_PMU_EVENT_FILTER with an action of
KVM_PMU_EVENT_ALLOW and an empty list of allowed events, KVM could
just disable the virtual PMU for that VM.
AMD will also have "CPUID Fn8000_0022_EBX[NumCorePmc, 3:0]".
Where do you see this? Revision 3.33 (November 2021) of the AMD APM,
volume 3, only goes as high as CPUID Fn8000_0021.
Try APM Revision: 4.04 (November 2021), page 1849/3273,
"CPUID Fn8000_0022_EBX Extended Performance Monitoring and Debug".
Is this a public document?
Given the current ambiguity in this revision, the AMD folks will reveal more
details about this field in the next revision.
Today, the semantics of an empty allow list are quite different from
the proposed pmuv module parameter being false. However, it should be
an easy conversion. Would anyone be concerned about changing the
current semantics of an empty allow list? Is there a need for
disabling PMU virtualization for legacy userspace implementations that
can't be modified to ask for an empty allow list?
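To make the difference concrete, here is a minimal userspace sketch of the two readings of an empty allow list. It is a model only, not KVM code: the model_* names and the struct are hypothetical, and only the action values are taken from the uapi (KVM_PMU_EVENT_ALLOW is 0, KVM_PMU_EVENT_DENY is 1). The first function models the per-event check as it works today; the second models the proposed conversion, where an empty allow list would turn the virtual PMU off for the VM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model; values mirror KVM_PMU_EVENT_ALLOW / KVM_PMU_EVENT_DENY. */
enum model_action { MODEL_PMU_EVENT_ALLOW = 0, MODEL_PMU_EVENT_DENY = 1 };

/* Simplified stand-in for the filter userspace hands to KVM. */
struct model_filter {
	enum model_action action;
	size_t nevents;
	const uint64_t *events;
};

/*
 * Today's reading: the filter only decides whether one event may be
 * counted. A NULL filter means no filter is installed, so every event
 * is allowed. An empty allow list therefore rejects every event, but
 * the guest still sees a PMU.
 */
bool model_event_allowed(const struct model_filter *f, uint64_t event)
{
	bool listed = false;

	if (!f)
		return true;
	assert(f->nevents == 0 || f->events != NULL);

	for (size_t i = 0; i < f->nevents; i++) {
		if (f->events[i] == event) {
			listed = true;
			break;
		}
	}
	return f->action == MODEL_PMU_EVENT_ALLOW ? listed : !listed;
}

/*
 * Proposed reading (assumption drawn from this thread): an empty allow
 * list means the virtual PMU as a whole is disabled for this VM.
 */
bool model_vpmu_enabled(const struct model_filter *f)
{
	if (!f)
		return true;
	return !(f->action == MODEL_PMU_EVENT_ALLOW && f->nevents == 0);
}
```

With an empty allow list, model_event_allowed() is false for every event under both readings; only the second reading also makes model_vpmu_enabled() false. An empty deny list leaves everything enabled in both.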
AFAIK, at least one user-space agent has integrated with it plus additional
"action"s.
Once the API that the kernel presents to user space has been defined,
it's best not to change it, or we will come to regret it.
Okay.
I propose the following:
1) The new module parameter should apply to Intel as well as AMD, for
situations where userspace is not trusted.
2) If the module parameter allows PMU virtualization, there should be
a new KVM_CAP whereby userspace can enable/disable PMU virtualization.
(Since you require a dynamic toggle, and there is a move afoot to
refuse guest CPUID changes once a guest is running, this new KVM_CAP
is needed on Intel as well as AMD).
Both hands in favour. Do you need me as a labourer, or do you have a ready-made one?
We could split the work. Since (1) is a modification of the change you
proposed in this thread, perhaps you could apply it to both AMD and
Intel in v2? We can find someone for (2).
3) If the module parameter does not allow PMU virtualization, there
should be no userspace override, since we have no precedent for
authorizing that kind of override.
Uh, I thought you (Google) had a lot of these (interesting) use cases internally.
We have modified some module parameters so that they can be changed at
runtime, but we don't have any concept of a privileged userspace
overriding a module parameter restriction.
"But I am not a decision maker." :D
Thanks,
Like Xu