Re: [PATCH RFC] KVM: arm64: PMU: Use multiple host PMUs

From: Marc Zyngier
Date: Wed Mar 19 2025 - 14:39:03 EST


On Wed, 19 Mar 2025 11:51:21 +0000,
Akihiko Odaki <akihiko.odaki@xxxxxxxxxx> wrote:
>
> On 2025/03/19 20:41, Marc Zyngier wrote:
> > On Wed, 19 Mar 2025 11:26:18 +0000,
> > Akihiko Odaki <akihiko.odaki@xxxxxxxxxx> wrote:
> >>
> >> On 2025/03/19 20:07, Marc Zyngier wrote:
> >>> On Wed, 19 Mar 2025 10:26:57 +0000,
> >>>>
> >>> But that'd be a new ABI, which again would require buy-in from
> >>> userspace. Maybe there is scope for an all CPUs, cycle-counter only
> >>> PMUv3 exposed to the guest, but that cannot be set automatically, as
> >>> we would otherwise regress existing setups.
> >>>
> >>> At this stage, and given that you need to change userspace, I'm not
> >>> sure what the best course of action is.
> >>
> >> Having an explicit flag for userspace is fine for QEMU, which is what
> >> I care about. It can flip the flag if and only if threads are not
> >> pinned to one PMU and the machine is a new setup.
> >>
> >> I also wonder what regression you think setting it automatically causes.
> >
> > The current behaviour is that if you don't specify anything other than
> > creating a PMUv3 (without KVM_ARM_VCPU_PMU_V3_SET_PMU), you get *some*
> > PMU, and userspace is responsible for running the vcpu on CPUs that
> > will implement that PMU. When it does, all the counters, all the
> > events are valid. If it doesn't, nothing counts, but the
> > counters/events are still valid.
> >
> > If you now add this flag automatically, the guest doesn't see the full
> > PMU anymore. Only the cycle counter. That's the regression.
>
> What about setting the flag automatically when a user fails to pin
> vCPUs to CPUs that are covered by one PMU? There would be no change if
> a user correctly pins vCPUs as it is. Otherwise, they will see a
> correct feature set advertised to the guest and the cycle counter
> working.

How do you know that the affinity is "correct"? VCPU affinity can be
changed at any time. I, for one, do not want my VMs to change
behaviour because I let the vcpus bounce around as the scheduler sees
fit.
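
To make that concrete, here is a minimal sketch of what such an
automatic check would amount to. It is purely illustrative: the names
(cpu_set64, affinity_covered_by_pmu, would_downgrade_pmu) are
hypothetical, and plain 64-bit masks stand in for real cpumasks. This
is not kernel code, only a model of the policy being proposed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: bit N set means physical CPU N. Not a kernel cpumask. */
typedef uint64_t cpu_set64;

/* True if every CPU the vCPU thread may run on is served by the given PMU. */
static bool affinity_covered_by_pmu(cpu_set64 affinity, cpu_set64 pmu_cpus)
{
	assert(affinity != 0); /* a runnable thread has at least one allowed CPU */
	return (affinity & ~pmu_cpus) == 0;
}

/*
 * Hypothetical "automatic" policy: downgrade the guest to a cycle-counter-only
 * PMU whenever the affinity is not covered. The result depends only on the
 * affinity sampled at the time of the call, which userspace can change later.
 */
static bool would_downgrade_pmu(cpu_set64 affinity, cpu_set64 pmu_cpus)
{
	return !affinity_covered_by_pmu(affinity, pmu_cpus);
}
```

With a PMU covering CPUs 0-3 (mask 0x0F), a thread pinned to CPUs 0-1
(mask 0x03) is covered, while the same thread allowed on CPUs 0-7
(mask 0xFF) is not. The same VM would therefore present a different
PMU depending on when the affinity happened to be sampled.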

Honestly, this is not a can of worms I want to open. We already have a
pretty terrible userspace API for the PMU, let's not add to the
confusion. *If* we are going down the road of presenting a dumbed-down
PMU to the guest, it has to be an explicit buy-in from userspace.

M.

--
Without deviation from the norm, progress is not possible.