Re: [perf metricgroup] fcc9c5243c: perf-sanity-tests.Parse_and_process_metrics.fail

From: Ian Rogers
Date: Tue Oct 20 2020 - 12:53:57 EST


On Tue, Oct 20, 2020 at 1:56 AM kajoljain <kjain@xxxxxxxxxxxxx> wrote:
>
>
>
> On 10/19/20 9:50 PM, Ian Rogers wrote:
> > On Mon, Oct 19, 2020 at 2:51 AM John Garry <john.garry@xxxxxxxxxx> wrote:
> >>
> >> On 19/10/2020 00:30, Ian Rogers wrote:
> >>> On Sun, Oct 18, 2020 at 1:51 AM kernel test robot <rong.a.chen@xxxxxxxxx> wrote:
> >>>>
> >>>> Greeting,
> >>>>
> >>>> FYI, we noticed the following commit (built with gcc-9):
> >>>>
> >>>> commit: fcc9c5243c478f104014daf4d23db86098d2aef0 ("perf metricgroup: Hack a fix for aliases when covering multiple PMUs")
> >>>> url: https://github.com/0day-ci/linux/commits/John-Garry/perf-pmu-events-Support-event-aliasing-for-system-PMUs/20201008-182049
> >>>>
> >>>>
> >>>> in testcase: perf-sanity-tests
> >>>> version: perf-x86_64-c85fb28b6f99-1_20201008
> >>>> with following parameters:
> >>>>
> >>>> perf_compiler: gcc
> >>>> ucode: 0xdc
> >>>>
> >>>>
> >>>>
> >>>> on test machine: 4 threads Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz with 32G memory
> >>>>
> >>>> caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
> >>>
> >>> I believe this is a Skylake and there is a known bug in the Skylake
> >>> metric DRAM_Parallel_Reads as described here:
> >>> https://lore.kernel.org/lkml/CAP-5=fXejVaQa9qfW66cY77qB962+jbe8tT5bsLoOOcFmODnWQ@xxxxxxxxxxxxxx/
> >>> Fixing the bug needs more knowledge than what is available in manuals.
> >>> Hopefully Intel can take a look.
> >>>
> >>> Thanks,
> >>> Ian
> >>
> >> So this named patch ("perf metricgroup: Hack a fix for aliases...") is
> >> breaking test #67 on my machine also, which is a broadwell.
> >
> > Thanks for taking a look John. If you want help you can send the
> > output of "perf test 67 -vvv" to me. It is possible Broadwell has
> > similar glitches in the json to Skylake. I tested the original test on
> > server parts as I can access them as cloud machines.
> >
> >> I will have a look, but I was hoping that Ian would have a proper fix
> >> for this on top of ("perf metricgroup: Fix uncore metric expressions"),
> >> which now looks to be merged.
> >
> > I still have these changes to look at in my inbox but I'm assuming
> > they're good :-) Sorry for not getting to them, but it's good they are
> > merged.
>
> Hi Ian,
> Checked in upstream kernel with your fix patch, in powerpc also test case 67 is passing.
> But I am getting issue in test 10 for powerpc
>
> [command]# ./perf test 10
> 10: PMU events :
> 10.1: PMU event table sanity : Ok
> 10.2: PMU event map aliases : Ok
> 10.3: Parsing of PMU event table metrics : Skip (some metrics failed)
> 10.4: Parsing of PMU event table metrics with fake PMUs : FAILED!
>
> Was debugging it, issue is with commit e1c92a7fbbc5 perf tests: Add another metric parsing test.
>
> So, there we are passing different runtime parameter value in "expr__find_other and expr__parse"
> in function `metric_parse_fake`. I believe we need to send same value.
> I will send fix patch for the same.
>
> Thanks,
> Kajol Jain

Thanks, the fake support was done by Jiri. I do try to test on Power
8. The awesome thing, aside from the testing nit fixes, is that the
metrics will actually work once the test is passing :-). They may of
course report junk.

Thanks,
Ian

> >
> > Thanks,
> > Ian
> >
> >> Thanks!
> >>
> >>>
> >>>>
> >>>>
> >>>> If you fix the issue, kindly add following tag
> >>>> Reported-by: kernel test robot <rong.a.chen@xxxxxxxxx>
> >>>>
> >>>>
> >>>> 2020-10-16 19:31:52 sudo /usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf test 67
> >>>> 67: Parse and process metrics : FAILED!
> >>>> 2020-10-16 19:31:52 sudo /usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf test 68
> >>>> 68: x86 rdpmc : Ok
> >>>> 2020-10-16 19:31:52 sudo /usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf test 69
> >>>> 69: Convert perf time to TSC : Ok
> >>>> 2020-10-16 19:31:52 sudo /usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf test 70
> >>>> 70: DWARF unwind : Ok
> >>>> 2020-10-16 19:31:52 sudo /usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf test 71
> >>>> 71: x86 instruction decoder - new instructions : Ok
> >>>> 2020-10-16 19:31:52 sudo /usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf test 72
> >>>> 72: Intel PT packet decoder : Ok
> >>>> 2020-10-16 19:31:52 sudo /usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf test 73
> >>>> 73: x86 bp modify : Ok
> >>>> 2020-10-16 19:31:53 sudo /usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf test 74
> >>>> 74: probe libc's inet_pton & backtrace it with ping : Ok
> >>>> 2020-10-16 19:31:54 sudo /usr/src/perf_selftests-x86_64-rhel-8.3-fcc9c5243c478f104014daf4d23db86098d2aef0/tools/perf/perf test 75
> >>>> 75: Zstd perf.data compression/decompression : Ok
> >>>>
> >>>>
> >>>>
> >>>> To reproduce:
> >>>>
> >>>> git clone https://github.com/intel/lkp-tests.git
> >>>> cd lkp-tests
> >>>> bin/lkp install job.yaml # job file is attached in this email
> >>>> bin/lkp run job.yaml
> >>>>
> >>>>
> >>>>
> >>>> Thanks,
> >>>> Rong Chen
> >>>>
> >>> .
> >>>
> >>