On Mon, Apr 22, 2024 at 11:45 PM Mi, Dapeng <dapeng1.mi@xxxxxxxxxxxxxxx> wrote:
Is there any event in the host still having PERF_EVENT_STATE_ACTIVE?
On 4/23/2024 2:08 PM, maobibo wrote:
Just go through functions (not sure if all), whether
On 2024/4/23 12:23 PM, Mingwei Zhang wrote:
On Mon, Apr 22, 2024 at 8:55 PM maobibo <maobibo@xxxxxxxxxxx> wrote:
perf core will access pmu hw in tick timer/hrtimer/ipi function call,
This is correct. And this is one of the points that we had debated
On 2024/4/23 11:13 AM, Mi, Dapeng wrote:
On 4/23/2024 10:53 AM, maobibo wrote:
yes, it is.
That's one issue this patchset tries to solve. Current new mediated
On 2024/4/23 10:44 AM, Mi, Dapeng wrote:
On 4/23/2024 9:01 AM, maobibo wrote:
yes, if PMU HW counter/control is granted to VM. The value comes from
PMU event is a software entity which won't be shared. Did you
On 2024/4/23 1:01 AM, Sean Christopherson wrote:
On Mon, Apr 22, 2024, maobibo wrote:
Hi Sean,
On 2024/4/16 6:45 AM, Sean Christopherson wrote:
Two questions:
On Mon, Apr 15, 2024, Mingwei Zhang wrote:
There is a similar issue on LoongArch vPMU where vm can directly
On Mon, Apr 15, 2024 at 10:38 AM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
One of my biggest complaints with the current vPMU code is that the
roles and responsibilities between KVM and perf are poorly defined,
which leads to suboptimal and hard to maintain code.

Case in point, I'm pretty sure leaving guest values in PMCs _would_
leak guest state to userspace processes that have RDPMC permissions,
as the PMCs might not be dirty from perf's perspective (see
perf_clear_dirty_counters()).

Blindly clearing PMCs in KVM "solves" that problem, but in doing so
makes the overall code brittle because it's not clear whether KVM
_needs_ to clear PMCs, or if KVM is just being paranoid.

So once this rolls out, perf and vPMU are clients directly to PMU HW.

I don't think this is a statement we want to make, as it opens a
discussion that we won't win. Nor do I think it's one we *need* to
make. KVM doesn't need to be on equal footing with perf in terms of
owning/managing PMU hardware, KVM just needs a few APIs to allow
faithfully and accurately virtualizing a guest PMU.

Faithful cleaning (blind cleaning) has to be the baseline
implementation, until both clients agree to a "deal" between them.
Currently, there is no such deal, but I believe we could have one via
future discussion.

What I am saying is that there needs to be a "deal" in place before
this code is merged. It doesn't need to be anything fancy, e.g. perf
can still pave over PMCs it doesn't immediately load, as opposed to
using cpu_hw_events.dirty to lazily do the clearing. But perf and KVM
need to work together from the get go, ie. I don't want KVM doing
something without regard to what perf does, and vice versa.

pmu hardware and pmu hw is shared with guest and host. Besides context
switch there are other places where perf core will access pmu hw, such
as tick timer/hrtimer/ipi function call, and KVM can only intercept
context switch.
1) Can KVM prevent the guest from accessing the PMU?
2) If so, can KVM grant partial access to the PMU, or is it all or
   nothing?
If the answer to both questions is "yes", then it sounds like LoongArch
*requires* mediated/passthrough support in order to virtualize its PMU.
Thanks for your quick response.
Yes, kvm can prevent guest from accessing the PMU and grant partial or
all access to the PMU. However, if one pmu event is granted to the VM,
the host can not access this pmu event again. There must be a pmu event
switch if the host wants to.
mean if
a PMU HW counter is granted to VM, then Host can't access the PMU HW
counter, right?
guest, and is not meaningful for host. Host pmu core does not know
that it is granted to VM, host still thinks that it owns pmu.
x86 vPMU framework doesn't allow Host and Guest to own the PMU HW
resource simultaneously. Only when there is no !exclude_guest event on
host, guest is allowed to exclusively own the PMU HW resource.
Just like FPU register, it is shared by VM and host during different
time and it is lazily switched. But if IPI or timer interrupt uses FPU
register on host, there will be the same issue.

I didn't fully get your point. When an IPI or timer interrupt arrives,
a VM-exit is triggered to make the CPU trap into the host first and
then the host interrupt handler is called. Or are you complaining about
the execution sequence of switching guest PMU MSRs and these interrupt
handlers?

internally whether we should do PMU context switch at vcpu loop
boundary or VM Enter/exit boundary. (host-level) timer interrupt can
force VM Exit, which I think happens every 4ms or 1ms, depending on
configuration.

One of the key reasons we currently propose this is because it is the
same boundary as the legacy PMU, i.e., it would be simple to propose
from the perf subsystem perspective.

Performance wise, doing PMU context switch at vcpu boundary would be
way better in general. But the downside is that perf sub-system loses
the capability to profile the majority of the KVM code (functions) when
guest PMU is enabled.

In our vPMU implementation, it is ok if vPMU is switched in vm exit
path, however there is a problem if vPMU is switched during vcpu thread
sched-out/sched-in path since IPI/timer irq interrupt access pmu
register in host mode.

Oh, the IPI/timer irq handler will access PMU registers? I thought
only the host-level NMI handler will access the PMU MSRs since PMI is
registered under NMI.

In that case, you should disable IRQ during vcpu context switch. For
NMI, we prevent its handler from accessing the PMU registers. In
particular, we use a per-cpu variable to guard that. So, the
host-level PMI handler for perf sub-system will check the variable
before proceeding.
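
The guard described above can be sketched in user space roughly as
follows. This is only an illustration under stated assumptions: a
single CPU is modeled, so a plain static flag stands in for the per-cpu
variable, and all names (guest_owns_pmu, sketch_guest_pmu_load(),
sketch_guest_pmu_put(), sketch_host_pmi_handler()) are hypothetical and
not taken from the patchset.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-cpu flag; one CPU is modeled here. */
static bool guest_owns_pmu;

/* Counts simulated host accesses to PMU registers. */
static int host_pmu_accesses;

/* Hand the PMU to the guest. The caller is assumed to have IRQs disabled. */
void sketch_guest_pmu_load(void)
{
    assert(!guest_owns_pmu);   /* loading twice would be a bug */
    guest_owns_pmu = true;
}

/* Return the PMU to the host. The caller is assumed to have IRQs disabled. */
void sketch_guest_pmu_put(void)
{
    assert(guest_owns_pmu);    /* nothing to put back otherwise */
    guest_owns_pmu = false;
}

/* Simulated host PMI (NMI) handler; returns true if it touched the PMU. */
bool sketch_host_pmi_handler(void)
{
    if (guest_owns_pmu)
        return false;          /* PMU holds guest state: leave it alone */
    host_pmu_accesses++;       /* stands in for reading/writing PMU MSRs */
    return true;
}
```

The point of the flag is that an NMI cannot be masked the way an IRQ
can, so the handler itself has to check who owns the hardware before
touching it.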
such as function perf_event_task_tick() is called in tick timer, there
are event_function_call(event, __perf_event_xxx, &value) in file
kernel/events/core.c.
https://lore.kernel.org/lkml/20240417065236.500011-1-gaosong@xxxxxxxxxxx/T/#m15aeb79fdc9ce72dd5b374edd6acdcf7a9dafcf4
perf_event_task_tick() or the callbacks of event_function_call() would
check the event->state first, if the event is in
PERF_EVENT_STATE_INACTIVE, the PMU HW MSRs would not be touched really.
In this new proposal, all host events with exclude_guest attribute would
be put on PERF_EVENT_STATE_INACTIVE state if guest owns the PMU HW
resource. So I think it's fine.
If so, hmm, it will reach perf_pmu_disable(event->pmu), which will
access the global ctrl MSR.
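
For reference, the ownership rule and the state check discussed in this
thread can be modeled in a few lines of user-space C. This is a
simplified sketch, not kernel code: the struct, enum and function names
are hypothetical, the two states only mirror the idea of
PERF_EVENT_STATE_INACTIVE/ACTIVE, and the sketch does not model
perf_pmu_disable() or the global ctrl MSR access raised above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the perf event states named in the thread. */
enum sketch_state { SKETCH_INACTIVE, SKETCH_ACTIVE };

struct sketch_event {
    enum sketch_state state;
    bool exclude_guest;
};

/*
 * The guest may own the PMU only if no host event has exclude_guest
 * cleared. On success every host event is parked as inactive.
 */
bool sketch_guest_take_pmu(struct sketch_event *ev, size_t n)
{
    assert(ev != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (!ev[i].exclude_guest)
            return false;      /* a !exclude_guest event blocks the guest */
    for (size_t i = 0; i < n; i++)
        ev[i].state = SKETCH_INACTIVE;
    return true;
}

/* Tick path: only active events would touch PMU hardware. */
size_t sketch_tick_hw_accesses(const struct sketch_event *ev, size_t n)
{
    size_t touched = 0;

    assert(ev != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (ev[i].state == SKETCH_ACTIVE)
            touched++;
    return touched;
}
```

With two exclude_guest events, the tick path touches hardware twice
before the guest takes the PMU and not at all afterwards; if either
event has exclude_guest cleared, the guest is refused and the host
events stay active.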