Re: [RFC 1/4] perf kvm: Enable 'record' on powerpc

From: Arnaldo Carvalho de Melo
Date: Tue Mar 22 2016 - 22:19:35 EST


Em Tue, Mar 22, 2016 at 04:12:11PM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Wed, Feb 24, 2016 at 02:37:42PM +0530, Ravi Bangoria escreveu:
> > 'perf kvm record' is not available on powerpc because 'perf' relies on
> > the 'cycles' event (a PMU event) to profile the guest. However, for
> > powerpc, this can't be used from the host because the PMUs are controlled
> > by the guest rather than the host.
> >
> > There exists a tracepoint 'kvm_hv:kvm_guest_exit' in powerpc which is
> > hit whenever any of the threads exits the guest context. The 'pc' field
> > dumped along with this tracepoint's data can be used as the guest
> > instruction pointer.
> >
> > This patch changes the default event to kvm_hv:kvm_guest_exit for recording
> > guest data in the host on powerpc. As we are using a host event to record guest
> > data, this approach will enable only the --guest option of 'perf kvm'. Still
> > --host --guest together won't work.
>
> It should, i.e. --host --guest should translate to:
>
> -e cycles:H,kvm_hv:kvm_guest_exit
>
> I.e. both collect cycles only in the host, and also the tracepoint that
> will allow us to get the guest approximation for the unavailable cycles
> event, no?
>
> I'm putting the infrastructure work needed for this in the perf/cpumode
> branch. More work will be put there soon.

So I took a different path and made perf_evsel__parse_sample set a new
perf_sample.cpumode field. This way we'll end up just having to set a
per-evsel ->post_parse_sample() callback for the event that replaces
"cycles" for PPC guests, where we'll just set data->ip and data->cpumode;
the rest of the code remains unchanged.
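
A minimal sketch of that idea, to make the flow concrete. It does not use
the real perf structures: struct mock_evsel, struct mock_sample, the
guest_pc member and the MOCK_CPUMODE_* values are all hypothetical
stand-ins, and always reporting "guest kernel" is an assumption made only
to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mock cpumode values for this sketch only (hypothetical names). */
enum mock_cpumode {
	MOCK_CPUMODE_KERNEL = 1,
	MOCK_CPUMODE_GUEST_KERNEL = 4,
};

/* Hypothetical stand-in for the parsed sample. */
struct mock_sample {
	uint64_t ip;       /* instruction pointer reported for the sample */
	uint8_t cpumode;   /* where the sample is attributed */
	uint64_t guest_pc; /* stands in for the tracepoint's 'pc' field */
};

/* Hypothetical stand-in for an evsel with an optional per-event hook. */
struct mock_evsel {
	const char *name;
	void (*post_parse_sample)(struct mock_evsel *evsel,
				  struct mock_sample *sample);
};

/*
 * Hook for the event that replaces "cycles" in the guest case: override
 * only ip and cpumode, leave everything else as the generic parser set it.
 * Assumption: every guest exit is attributed to the guest kernel.
 */
static void guest_exit_post_parse(struct mock_evsel *evsel,
				  struct mock_sample *sample)
{
	assert(evsel != NULL && sample != NULL);
	sample->ip = sample->guest_pc;
	sample->cpumode = MOCK_CPUMODE_GUEST_KERNEL;
}

/*
 * Generic parsing path: fill in the defaults, then call the per-evsel
 * hook if one is installed. Events without a hook are left untouched.
 */
static void mock_parse_sample(struct mock_evsel *evsel,
			      struct mock_sample *sample,
			      uint64_t host_ip, uint64_t raw_pc)
{
	sample->ip = host_ip;
	sample->cpumode = MOCK_CPUMODE_KERNEL;
	sample->guest_pc = raw_pc;

	if (evsel->post_parse_sample)
		evsel->post_parse_sample(evsel, sample);
}
```

With this shape, an event such as cycles:H would have no hook and keep the
host ip and cpumode, while the tracepoint event gets the hook and reports
the guest values; callers of the parsing function need no changes.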

The changes I made look useful in themselves since, IIRC, more code was
removed than added.

I'll continue tomorrow and will test with kvm:kvm_exit on x86_64, which
has:

[root@jouet ~]# perf kvm --guest record -a -e cycles:H,kvm:kvm_exit sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 2.410 MB perf.data.guest (16076 samples) ]
[root@jouet ~]# perf evlist -i perf.data.guest --trace-fields
cycles:H: (not a tracepoint)
kvm:kvm_exit: trace_fields: exit_reason,guest_rip,isa,info1,info2
[root@jouet ~]#

Enough for me to test this code without requiring access to a PPC64
machine with a guest.

The first approach, more in line with your latest patchkit, is at
perf/cpumode.v1; the one I'm working on now is at the perf/cpumode branch
in my git tree at:

https://git.kernel.org/cgit/linux/kernel/git/acme/linux.git/log/?h=perf/cpumode

- Arnaldo

> I didn't like the fact that this was touching the common code too much,
> so I'm providing infrastructure for the evsel to be used to obtain the
> cpumode, but trying to constrain the kvm specific bits.
>
> I.e. we should not touch perf_evlist__add_default, but instead replace
> "cycles" with what is appropriate for 'perf kvm record' when that event
> is specified, be it explicitly or not.
>
> Doing it this way I _think_ we'll be able to remove the limitation
> stated in your last paragraph above ("Still --host --guest together
> won't work").
>
> I.e. this, but automatically:
>
> [root@jouet ~]# perf kvm record -a -e cycles:H,kvm:kvm_exit
> ^C[ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 1.428 MB perf.data.guest (1644 samples) ]
>
> [root@jouet ~]#
> [root@jouet ~]# perf evlist -i perf.data.guest
> cycles:H
> kvm:kvm_exit
> # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events
> [root@jouet ~]#
>
> If we look at the verbose output:
>
> [root@jouet ~]# perf evlist -v -i perf.data.guest
> cycles:H: size: 112, { sample_period, sample_freq }: 4000, sample_type:
> IP|TID|TIME|CPU|PERIOD|IDENTIFIER, read_format: ID, disabled: 1,
> inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1,
> exclude_guest: 1, mmap2: 1, comm_exec: 1
> kvm:kvm_exit: type: 2, size: 112, config: 0x596, { sample_period,
> sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER,
> read_format: ID, disabled: 1, inherit: 1, sample_id_all: 1,
> exclude_host: 1
> # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events
> [root@jouet ~]#
>
> ----------------------------------------------------------------------------
>
> Which works as well with:
>
> [root@jouet ~]# perf kvm --guest --host record -a -e cycles:H,kvm:kvm_exit
> ^C[ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 1.387 MB perf.data.kvm (1090 samples) ]
>
> [root@jouet ~]# perf evlist -v -i perf.data.kvm
> cycles:H: size: 112, { sample_period, sample_freq }: 4000, sample_type:
> IP|TID|TIME|CPU|PERIOD|IDENTIFIER, read_format: ID, disabled: 1,
> inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1,
> exclude_guest: 1, mmap2: 1, comm_exec: 1
> kvm:kvm_exit: type: 2, size: 112, config: 0x596, { sample_period,
> sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER,
> read_format: ID, disabled: 1, inherit: 1, sample_id_all: 1
> # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events
> [root@jouet ~]#
>
> ----------------------------------------------------------------------------
>
> Comments?
>
> - Arnaldo