Re: [PATCH v1 04/15] perf, tools: Support weak groups
From: Jiri Olsa
Date: Wed Aug 02 2017 - 03:35:13 EST
On Mon, Jul 24, 2017 at 04:40:04PM -0700, Andi Kleen wrote:
> From: Andi Kleen <ak@xxxxxxxxxxxxxxx>
>
> Setting up groups can be tricky because of the
> complicated scheduling restrictions of different PMUs.
> User tools usually don't understand all these restrictions.
> Still, in many cases it is useful to set up groups, and
> they work most of the time. However, if the group
> is set up wrong, some members will not report any values
> because they never get scheduled.
>
> Add the concept of a 'weak group': try to set up a group,
> but if it's not schedulable, fall back to not using
> a group. That gives us the best of both worlds:
> groups if they work, but still a usable fallback if they don't.
>
> In theory it would be possible to have more complex fallback
> strategies (e.g. try to split the group in half), but
> the simple fallback of not using a group seems to work for now.
>
> So far the weak group is only implemented for perf stat,
> not for record.
>
> Here's an unschedulable group (on IvyBridge with SMT on)
>
> % perf stat -e '{branches,branch-misses,l1d.replacement,l2_lines_in.all,l2_rqsts.all_code_rd}' -a sleep 1
>
> 73,806,067 branches
> 4,848,144 branch-misses # 6.57% of all branches
> 14,754,458 l1d.replacement
> 24,905,558 l2_lines_in.all
> <not supported> l2_rqsts.all_code_rd <------- will never report anything
>
> With the weak group:
>
> % perf stat -e '{branches,branch-misses,l1d.replacement,l2_lines_in.all,l2_rqsts.all_code_rd}:W' -a sleep 1
>
> 125,366,055 branches (80.02%)
> 9,208,402 branch-misses # 7.35% of all branches (80.01%)
> 24,560,249 l1d.replacement (80.00%)
> 43,174,971 l2_lines_in.all (80.05%)
> 31,891,457 l2_rqsts.all_code_rd (79.92%)
looks handy, a few comments below
thanks,
jirka
>
> The extra event is now scheduled, with some extra multiplexing
>
> Signed-off-by: Andi Kleen <ak@xxxxxxxxxxxxxxx>
> ---
SNIP
> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> index 97d6b6c42014..551ed938e05c 100644
> --- a/tools/perf/builtin-stat.c
> +++ b/tools/perf/builtin-stat.c
> @@ -564,7 +564,7 @@ static int __run_perf_stat(int argc, const char **argv)
> int interval = stat_config.interval;
> char msg[BUFSIZ];
> unsigned long long t0, t1;
> - struct perf_evsel *counter;
> + struct perf_evsel *counter, *c2, *leader;
> struct timespec ts;
> size_t l;
> int status = 0;
> @@ -595,6 +595,32 @@ static int __run_perf_stat(int argc, const char **argv)
> evlist__for_each_entry(evsel_list, counter) {
> try_again:
> if (create_perf_stat_counter(counter) < 0) {
> + /* Weak group failed. Reset the group. */
> + if (errno == EINVAL &&
> + counter->leader != counter &&
> + counter->weak_group) {
could you please put this de-grouping code into a function?
(rough, untested sketch of what I mean at the end of this mail)
> + bool is_open = true;
> +
> + pr_debug("Weak group for %s/%d failed\n",
> + counter->leader->name, counter->nr_members);
> + leader = counter->leader;
> + evlist__for_each_entry(evsel_list, c2) {
we have for_each_group_member
> + if (c2 == counter)
> + is_open = false;
> + if (c2->leader == leader) {
> + if (is_open)
> + perf_evsel__close(c2,
> + c2->cpus ? c2->cpus->nr :
> + cpu_map__nr(evsel_list->cpus),
> + thread_map__nr(evsel_list->threads));
> + c2->leader = c2;
> + c2->nr_members = 0;
> + }
> + }
> + counter = leader;
> + goto try_again;
> + }
> +
> /*
> * PPC returns ENXIO for HW counters until 2.6.37
> * (behavior changed with commit b0a873e).
SNIP
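
Here's roughly what I mean by moving the de-grouping into a helper.
Completely untested sketch, and the helper name/placement are just for
illustration; since for_each_group_member skips the group leader, the
leader is closed and reset separately, which keeps the behavior of the
inlined code above:

static struct perf_evsel *reset_weak_group(struct perf_evlist *evsel_list,
					   struct perf_evsel *counter)
{
	struct perf_evsel *leader = counter->leader;
	struct perf_evsel *c2;
	bool is_open = true;
	int nthreads = thread_map__nr(evsel_list->threads);

	pr_debug("Weak group for %s/%d failed\n",
		 leader->name, leader->nr_members);

	/* The leader was opened first; close it, the caller reopens it. */
	perf_evsel__close(leader,
			  leader->cpus ? leader->cpus->nr :
			  cpu_map__nr(evsel_list->cpus),
			  nthreads);
	leader->nr_members = 0;

	/*
	 * Close the members that were already opened before @counter
	 * failed, and turn every member back into an independent event.
	 */
	for_each_group_member(c2, leader) {
		if (c2 == counter)
			is_open = false;
		if (is_open)
			perf_evsel__close(c2,
					  c2->cpus ? c2->cpus->nr :
					  cpu_map__nr(evsel_list->cpus),
					  nthreads);
		c2->leader = c2;
		c2->nr_members = 0;
	}

	return leader;
}

The call site in __run_perf_stat() would then boil down to
'counter = reset_weak_group(evsel_list, counter); goto try_again;'.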