Re: [PATCH v1 01/11] perf/x86/core: Support KVM to assign a dedicated counter for guest PEBS

From: Like Xu
Date: Wed Aug 19 2020 - 23:32:46 EST


Hi Peter,

On 2020/6/12 13:28, Kang, Luwei wrote:
Suppose your KVM thing claims counter 0/2 (ICL/SKL) for some random
PEBS event, and then the host wants to use PREC_DIST.. Then one of
them will be screwed for no reason whatsoever.


The multiplexing should be triggered.

For the host, if both user A and user B require PREC_DIST,
multiplexing should be triggered for them.
Now user B is KVM. I don't think there is a difference.
Multiplexing should still be triggered. Why is it screwed?

Because if KVM isn't PREC_DIST we should be able to reschedule it to a
different counter.

How is that not destroying scheduling freedom? Any other situation
we'd have moved the !PREC_DIST PEBS event to another counter.


All counters are equivalent for them. It doesn't matter if we move it
to another counter. There is no impact for the user.

But we cannot move it to another counter, because you're pinning it.

Hi Peter,

To avoid pinning counters, I have evaluated patching the PEBS records
for the guest in KVM. With this approach, the guest PEBS PMI handler
latency increased by about 30%
(e.g. perf record -e branch-loads:p -c 1000 ~/Tools/br_instr a).

Some implementation details are below:
1. Patch the "Applicable Counters" field of the guest PEBS records when the
counter the guest requested is not the one the host assigned, because the
guest PEBS driver drops PEBS records whose "Applicable Counters" field does
not match the requested counter index.
2. Trap (VM-exit) the guest driver's disabling of PEBS.
This happens before the guest reads PEBS records (e.g. in the PEBS PMI
handler, before application exit and so on).
3. To patch the guest PEBS records in KVM, we need to get the HPA of the
guest PEBS buffer.
<1> Trap the guest write to the IA32_DS_AREA register and get the GVA
of the guest DS AREA.
<2> Translate the DS AREA GVA to a GPA (kvm_mmu_gva_to_gpa_read)
and get the GVA of the guest PEBS buffer from the DS AREA
(kvm_vcpu_read_guest_atomic).
<3> Although we now have the GVA of the PEBS buffer, we need to do the
address translation (GVA->GPA->HPA) for each page, because we can't
assume the GPAs of the guest PEBS buffer are always contiguous.
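
To make steps 1 and 3<3> concrete, here is a minimal user-space sketch, not
KVM code. All names in it (remap_applicable_counters, chunk_in_page,
count_page_chunks, PAGE_SIZE_BYTES) are hypothetical, and it assumes 4 KiB
pages and a 64-bit "Applicable Counters" bitmask. It only shows the bit move
and the per-page split; the real GVA->GPA->HPA translation is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096u /* assumption: 4 KiB guest pages */

/*
 * Step 1: move the bit of the host-assigned counter to the counter the
 * guest asked for. If the guest bit is already set (the guest uses both
 * counters), the two bits collapse into one, so a real implementation
 * would need more than this.
 */
static uint64_t remap_applicable_counters(uint64_t mask,
                                          unsigned host_idx,
                                          unsigned guest_idx)
{
    assert(host_idx < 64 && guest_idx < 64);
    if (host_idx == guest_idx || !(mask & (1ULL << host_idx)))
        return mask; /* nothing to patch */
    mask &= ~(1ULL << host_idx);
    mask |= 1ULL << guest_idx;
    return mask;
}

/* Step 3<3>: bytes from gva up to the end of its page, capped at remaining. */
static size_t chunk_in_page(uint64_t gva, size_t remaining)
{
    size_t to_page_end =
        PAGE_SIZE_BYTES - (size_t)(gva & (PAGE_SIZE_BYTES - 1));
    return remaining < to_page_end ? remaining : to_page_end;
}

/* Number of separate per-page translations a [gva, gva + len) range needs. */
static unsigned count_page_chunks(uint64_t gva, size_t len)
{
    unsigned n = 0;

    while (len > 0) {
        size_t c = chunk_in_page(gva, len);

        gva += c; /* the next chunk starts on the following page */
        len -= c;
        n++;
    }
    return n;
}
```

For example, a 0x100-byte range starting at GVA 0x1ff0 crosses a page
boundary and therefore needs two translations, one per page.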

But we hit another issue with the PEBS counter reset field in the DS AREA.
pebs_event_reset in the DS AREA has to be set for auto reload, and it is per
counter. The guest and the host may use different counters. Let's say the
guest wants to use counter 0, but the host assigns counter 1 to the guest.
The guest sets the reset value in pebs_event_reset[0]. However, since
counter 1 is the one that is eventually scheduled, HW will use
pebs_event_reset[1] as the reset value.
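
The mismatch can be shown with a small user-space model. This is only a
sketch: struct ds_area_model, NUM_COUNTERS, guest_arm_reset and
hw_reload_value are hypothetical names, and the layout is a simplification,
not the real DS AREA layout.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_COUNTERS 8 /* assumption: number of PEBS-capable counters */

/* Simplified model: one auto-reload value per counter. */
struct ds_area_model {
    uint64_t pebs_event_reset[NUM_COUNTERS];
};

/* The guest driver programs the slot of the counter it believes it owns. */
static void guest_arm_reset(struct ds_area_model *ds, unsigned guest_idx,
                            uint64_t period)
{
    assert(guest_idx < NUM_COUNTERS);
    /* Store the negated period, so the counter overflows after period events. */
    ds->pebs_event_reset[guest_idx] = (uint64_t)0 - period;
}

/* The hardware reloads from the slot of the counter that actually runs. */
static uint64_t hw_reload_value(const struct ds_area_model *ds,
                                unsigned scheduled_idx)
{
    assert(scheduled_idx < NUM_COUNTERS);
    return ds->pebs_event_reset[scheduled_idx];
}
```

If the guest arms slot 0 with a period of 1000 and the host schedules
counter 1, hw_reload_value(&ds, 1) returns whatever slot 1 holds (0 in a
zeroed model), not the value the guest programmed.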

We can't copy the value of the guest's pebs_event_reset[0] to
pebs_event_reset[1] directly (patching the DS AREA) because the guest driver
may get confused, and we can't assume guest counters 0 and 1 are not both
used for this PEBS task at the same time. What's more, KVM can't be aware of
guest reads/writes to the DS AREA because it is just ordinary memory for the
guest.

What is your opinion or do you have a better proposal?

Do we have any update or a clear position
on this "patching the PEBS record for guest in KVM" proposal?

Thanks,
Like Xu


Thanks,
Luwei Kang


In the new proposal, KVM user is treated the same as other host events
with event constraint. The scheduler is free to choose whether or not
to assign a counter for it.

That's what it does, I understand that. I'm saying that that is creating artificial
contention.


Why is this needed anyway? Can't we force the guest to flush and then move it
over to a new counter?