Re: [PATCH] perf stat: Do not delay the workload with --delay

From: Ian Rogers
Date: Tue Dec 13 2022 - 19:19:55 EST


On Mon, Dec 12, 2022 at 3:08 PM Namhyung Kim <namhyung@xxxxxxxxxx> wrote:
>
> The -D/--delay option is to delay the measure after the program starts.
> But the current code goes to sleep before starting the program so the
> program is delayed too. This is not the intention, let's fix it.
>
> Before:
>
> $ time sudo ./perf stat -a -e cycles -D 3000 sleep 4
> Events disabled
> Events enabled
>
> Performance counter stats for 'system wide':
>
> 4,326,949,337 cycles
>
> 4.007494118 seconds time elapsed
>
> real 0m7.474s
> user 0m0.356s
> sys 0m0.120s
>
> It ran the workload for 4 seconds and gave the 3 second delay. So it
> should skip the first 3 second and measure the last 1 second only. But
> as you can see, it delays 3 seconds and ran the workload after that for
> 4 seconds. So the total time (real) was 7 seconds.
>
> After:
>
> $ time sudo ./perf stat -a -e cycles -D 3000 sleep 4
> Events disabled
> Events enabled
>
> Performance counter stats for 'system wide':
>
> 1,063,551,013 cycles
>
> 1.002769510 seconds time elapsed
>
> real 0m4.484s
> user 0m0.385s
> sys 0m0.086s

The commit message feels like it could almost be turned into a shell
test. The test would need some fudge factors in case of load on the
test system. Any thoughts if we could add this? We wouldn't need to
rely on 'time' as we have tool events of user_time, system_time, etc.

Thanks,
Ian

> The bug was introduced when it changed enablement of system-wide events
> with a command line workload. But it should've considered the initial
> delay case. The code was reworked since then (in bb8bc52e7578) so I'm
> afraid it won't be applied cleanly.
>
> Fixes: d0a0a511493d ("perf stat: Fix forked applications enablement of counters")
> Cc: Sumanth Korikkar <sumanthk@xxxxxxxxxxxxx>
> Cc: Thomas Richter <tmricht@xxxxxxxxxxxxx>
> Reported-by: Kevin Nomura <nomurak@xxxxxxxxxx>
> Signed-off-by: Namhyung Kim <namhyung@xxxxxxxxxx>
> ---
> tools/perf/builtin-stat.c | 33 +++++++++++++++++----------------
> 1 file changed, 17 insertions(+), 16 deletions(-)
>
> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> index d040fbcdcc5a..b39bf785a16e 100644
> --- a/tools/perf/builtin-stat.c
> +++ b/tools/perf/builtin-stat.c
> @@ -540,26 +540,14 @@ static int enable_counters(void)
> return err;
> }
>
> - if (stat_config.initial_delay < 0) {
> - pr_info(EVLIST_DISABLED_MSG);
> - return 0;
> - }
> -
> - if (stat_config.initial_delay > 0) {
> - pr_info(EVLIST_DISABLED_MSG);
> - usleep(stat_config.initial_delay * USEC_PER_MSEC);
> - }
> -
> /*
> * We need to enable counters only if:
> * - we don't have tracee (attaching to task or cpu)
> * - we have initial delay configured
> */
> - if (!target__none(&target) || stat_config.initial_delay) {
> + if (!target__none(&target)) {
> if (!all_counters_use_bpf)
> evlist__enable(evsel_list);
> - if (stat_config.initial_delay > 0)
> - pr_info(EVLIST_ENABLED_MSG);
> }
> return 0;
> }
> @@ -930,14 +918,27 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
> return err;
> }
>
> - err = enable_counters();
> - if (err)
> - return -1;
> + if (stat_config.initial_delay) {
> + pr_info(EVLIST_DISABLED_MSG);
> + } else {
> + err = enable_counters();
> + if (err)
> + return -1;
> + }
>
> /* Exec the command, if any */
> if (forks)
> evlist__start_workload(evsel_list);
>
> + if (stat_config.initial_delay > 0) {
> + usleep(stat_config.initial_delay * USEC_PER_MSEC);
> + err = enable_counters();
> + if (err)
> + return -1;
> +
> + pr_info(EVLIST_ENABLED_MSG);
> + }
> +
> t0 = rdclock();
> clock_gettime(CLOCK_MONOTONIC, &ref_time);
>
> --
> 2.39.0.rc1.256.g54fd8350bd-goog
>