Re: [PATCH 0/5] Support metric group constraint
From: Jiri Olsa
Date: Thu Feb 20 2020 - 06:39:38 EST
On Wed, Feb 19, 2020 at 11:08:35AM -0800, kan.liang@xxxxxxxxxxxxxxx wrote:
> From: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
> 
> Some metric groups, e.g. Page_Walks_Utilization, will never count when
> NMI watchdog is enabled.
> 
>  $echo 1 > /proc/sys/kernel/nmi_watchdog
>  $perf stat -M Page_Walks_Utilization
> 
>  Performance counter stats for 'system wide':
> 
>  <not counted>      itlb_misses.walk_pending       (0.00%)
>  <not counted>      dtlb_load_misses.walk_pending  (0.00%)
>  <not counted>      dtlb_store_misses.walk_pending (0.00%)
>  <not counted>      ept.walk_pending               (0.00%)
>  <not counted>      cycles                         (0.00%)
> 
>        2.343460588 seconds time elapsed
> 
>  Some events weren't counted. Try disabling the NMI watchdog:
>         echo 0 > /proc/sys/kernel/nmi_watchdog
>         perf stat ...
>         echo 1 > /proc/sys/kernel/nmi_watchdog
>  The events in group usually have to be from the same PMU. Try
>  reorganizing the group.
> 
> A metric group is a weak group, which relies on group validation
> code in the kernel to determine whether to be opened as a group or
> a non-group. However, group validation code may return false-positives,
> especially when NMI watchdog is enabled. (The metric group is allowed
> as a group but will never be scheduled.)
> 
> The attempt to fix the group validation code has been rejected:
> https://lore.kernel.org/lkml/20200117091341.GX2827@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
> The reason is that we cannot accurately predict whether the group can
> be scheduled as a group only by checking the current status.
> 
> This patch set provides another solution to mitigate the issue.
> Add "MetricConstraint" to the event list, which provides a hint for the
> perf tool, e.g. "MetricConstraint": "NO_NMI_WATCHDOG". The perf tool can
> change the metric group to a non-group (standalone metrics) if the NMI
> watchdog is enabled.

the problem is the missing counter, the one that's taken by the NMI
watchdog, right?

and it's a problem for any metric that won't fit into the available
counters.. shouldn't we rather apply this workaround to any metric
that wouldn't fit in the available counters, not just to chosen
ones?
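
a rough sketch of the generic check, only to illustrate the idea. The
helper names and the nr_counters parameter are hypothetical, not existing
perf code, and the model is deliberately simplified: it assumes the NMI
watchdog takes exactly one counter out of the pool a metric could use.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Interpret the text read from /proc/sys/kernel/nmi_watchdog.
 * Returns true only when the text starts with '1'.
 */
bool nmi_watchdog_text_enabled(const char *text)
{
	return text != NULL && text[0] == '1';
}

/* Read the sysctl file; an unreadable file is treated as "watchdog off". */
bool nmi_watchdog_enabled(void)
{
	char buf[8] = "";
	FILE *f = fopen("/proc/sys/kernel/nmi_watchdog", "r");

	if (!f)
		return false;
	if (!fgets(buf, sizeof(buf), f))
		buf[0] = '\0';
	fclose(f);
	return nmi_watchdog_text_enabled(buf);
}

/*
 * Hypothetical helper: can a metric with nr_events events be opened as
 * one group? nr_counters is an assumed number of usable counters; the
 * watchdog is assumed to occupy one of them (simplified model).
 */
bool metric_fits_as_group(int nr_events, int nr_counters, bool watchdog)
{
	int avail = nr_counters - (watchdog ? 1 : 0);

	if (avail < 0)
		avail = 0;
	return nr_events <= avail;
}
```

with something like this, a five-event metric checked against an assumed
pool of five counters would stay a group while the watchdog is off and be
opened as standalone events once the watchdog is on, with no per-metric
annotation needed.
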
thanks,
jirka