Re: [PATCH 10/10] perf/doc: update design.txt for exclude_{host|guest} flags

From: Michael Ellerman
Date: Tue Dec 11 2018 - 23:48:38 EST


Andrew Murray <andrew.murray@xxxxxxx> writes:
> On Tue, Dec 11, 2018 at 10:06:53PM +1100, Michael Ellerman wrote:
>> [ Reviving old thread. ]
>>
>> Andrew Murray <andrew.murray@xxxxxxx> writes:
>> > On Tue, Nov 20, 2018 at 10:31:36PM +1100, Michael Ellerman wrote:
>> >> Andrew Murray <andrew.murray@xxxxxxx> writes:
>> >>
>> >> > Update design.txt to reflect the presence of the exclude_host
>> >> > and exclude_guest perf flags.
>> >> >
>> >> > Signed-off-by: Andrew Murray <andrew.murray@xxxxxxx>
>> >> > ---
>> >> > tools/perf/design.txt | 4 ++++
>> >> > 1 file changed, 4 insertions(+)
>> >> >
>> >> > diff --git a/tools/perf/design.txt b/tools/perf/design.txt
>> >> > index a28dca2..7de7d83 100644
>> >> > --- a/tools/perf/design.txt
>> >> > +++ b/tools/perf/design.txt
>> >> > @@ -222,6 +222,10 @@ The 'exclude_user', 'exclude_kernel' and 'exclude_hv' bits provide a
>> >> > way to request that counting of events be restricted to times when the
>> >> > CPU is in user, kernel and/or hypervisor mode.
>> >> >
>> >> > +Furthermore the 'exclude_host' and 'exclude_guest' bits provide a way
>> >> > +to request counting of events restricted to guest and host contexts when
>> >> > +using virtualisation.
>> >>
>> >> How does exclude_host differ from exclude_hv ?
>> >
>> > I believe exclude_host / exclude_guest are intended to distinguish
>> > between host and guest in the hosted hypervisor context (KVM).
>>
>> OK yeah, from the perf-list man page:
>>
>> u - user-space counting
>> k - kernel counting
>> h - hypervisor counting
>> I - non idle counting
>> G - guest counting (in KVM guests)
>> H - host counting (not in KVM guests)
>>
>> > Whereas exclude_hv allows distinguishing between guest and
>> > hypervisor in the bare-metal type hypervisors.
>>
>> Except that's exactly not how we use them on powerpc :)
>>
>> We use exclude_hv to exclude "the hypervisor", regardless of whether
>> it's KVM or PowerVM (which is a bare-metal hypervisor).
>>
>> We don't use exclude_host / exclude_guest at all, which I guess is a
>> bug, except I didn't know they existed until this thread.
>>
>> eg, in a KVM guest:
>>
>> $ perf record -e cycles:G /bin/bash -c "for i in {0..100000}; do :;done"
>> $ perf report -D | grep -Fc "dso: [hypervisor]"
>> 16
>>
>>
>> > In the case of arm64 - if VHE extensions are present then the host
>> > kernel will run at a higher privilege to the guest kernel, in which
>> > case there is no distinction between hypervisor and host so we ignore
>> > exclude_hv. But where VHE extensions are not present then the host
>> > kernel runs at the same privilege level as the guest and we use a
>> > higher privilege level to switch between them - in this case we can
>> > use exclude_hv to discount that hypervisor role of switching between
>> > guests.
>>
>> I couldn't find any arm64 perf code using exclude_host/guest at all?
>
> Correct - but this is in flight as I am currently adding support for this
> see [1].

OK, so at least that will be consistent across arm64 & x86.

>> And I don't see any x86 code using exclude_hv.
>
> I can't find any either.

I think that's because they don't need it, because they don't let guests
program the PMU directly. It's all handled by the host and the host
doesn't let the guest count host cycles anyway. But I could be wrong, I'm
no x86 expert.

>> But maybe that's OK, I just worry this is confusing for users.
>
> There is some extra context regarding this where exclude_guest/exclude_host
> was added, see [2]

Good find. I had looked at that commit, but the thread on the list is
more informative.

In fact there was even a man page update! Never occurred to me to look
there :P

http://man7.org/linux/man-pages/man2/perf_event_open.2.html

exclude_host (since Linux 3.2)
When conducting measurements that include processes running VM
instances (i.e., have executed a KVM_RUN ioctl(2)), only measure
events happening inside a guest instance. This is only
meaningful outside the guests; this setting does not change
counts gathered inside of a guest. Currently, this functionality
is x86 only.

exclude_guest (since Linux 3.2)
When conducting measurements that include processes running VM
instances (i.e., have executed a KVM_RUN ioctl(2)), do not
measure events happening inside guest instances. This is only
meaningful outside the guests; this setting does not change
counts gathered inside of a guest. Currently, this functionality
is x86 only.


Which makes things much clearer.

Perhaps you want to add a reference to the man page in your text,
something like?

Furthermore the 'exclude_host' and 'exclude_guest' bits provide a way
to request counting of events restricted to guest and host contexts when
using virtualisation. See the perf_event_open(2) man page for more
detail.
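
For anyone following along, here is a rough userspace sketch of what
setting these bits looks like, going by the man page text above. The
helper names init_guest_only_cycles() and open_guest_only_cycles() are
made up for illustration, and whether the kernel actually honours the
bits depends on the architecture, as discussed in this thread. As far
as I can tell this is roughly what "cycles:G" asks for, but treat that
as an assumption.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>
#include <linux/perf_event.h>

/*
 * Hypothetical helper: describe a CPU cycles counter that only counts
 * while a guest is running (host context excluded).
 */
void init_guest_only_cycles(struct perf_event_attr *attr)
{
	assert(attr != NULL);
	memset(attr, 0, sizeof(*attr));
	attr->type = PERF_TYPE_HARDWARE;
	attr->size = sizeof(*attr);
	attr->config = PERF_COUNT_HW_CPU_CYCLES;
	attr->disabled = 1;       /* start stopped, enable via ioctl later */
	attr->exclude_host = 1;   /* do not count in host context */
	attr->exclude_guest = 0;  /* do count inside guest instances */
}

/*
 * Hypothetical helper: open the counter for the given pid on any CPU.
 * glibc has no wrapper for perf_event_open, so use syscall(2) directly.
 * Returns a file descriptor, or -1 with errno set.
 */
long open_guest_only_cycles(pid_t pid)
{
	struct perf_event_attr attr;

	init_guest_only_cycles(&attr);
	/* cpu = -1: any CPU, group_fd = -1: no group, flags = 0 */
	return syscall(__NR_perf_event_open, &attr, pid, -1, -1, 0UL);
}
```

Swapping the two assignments (exclude_host = 0, exclude_guest = 1)
would give the host-only case. Note the open can fail for unrelated
reasons, e.g. perf_event_paranoid settings or no PMU being available.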


cheers