Re: [PATCH v2] perf sched map: Add command-name, fuzzy-name options to filter the output map

From: Namhyung Kim
Date: Mon Jun 10 2024 - 18:55:20 EST


Hello,

On Sat, Jun 08, 2024 at 06:18:29PM +0530, Madadi Vineeth Reddy wrote:
> By default, perf sched map prints sched-in events for all the tasks
> which may not be required all the time as it prints lot of symbols
> and rows to the terminal.
>
> With --command-name option, one could specify the specific command(s)
> for which the map has to be shown. This would help in analyzing the
> CPU usage patterns easier for that specific command(s). Since multiple
> PID's might have the same command name, using command-name filter
> would be more useful for debugging.
>
> Multiple command names can be given with a comma separator without
> whitespace.
>
> The --fuzzy-name option can be used if fuzzy name matching is required.
> For example, "taskname" can be matched to any string that contains
> "taskname" as its substring.
>
> For other tasks, instead of printing the symbol, ** is printed and
> the same . is used to represent idle. ** is used instead of symbol
> for other tasks because it helps in clear visualization of command(s)
> of interest and secondly the symbol itself doesn't mean anything
> because the sched-in of that symbol will not be printed(first sched-in
> contains pid and the corresponding symbol).
>
> 6.10.0-rc1
> ==========
> *A0 213864.670142 secs A0 => migration/0:18
> *. 213864.670148 secs . => swapper:0
> . *B0 213864.670217 secs B0 => migration/1:21
> . *. 213864.670223 secs
> . . *C0 213864.670247 secs C0 => migration/2:26
> . . *. 213864.670252 secs
>
> 6.10.0-rc1 + patch (--command-name = schbench)
> =============
> ** . ** *A0 213864.671055 secs A0 => schbench:104834
> *B0 . . A0 213864.671156 secs B0 => schbench:104835
> *C0 . . A0 213864.671187 secs C0 => schbench:104836

I still think some people are interested in sched-out time. For
example, we don't know when B0 was scheduled out in the above. There
could be other tasks between B0 and C0 on the CPU 0.


> *D0 . . A0 213864.671219 secs D0 => schbench:104837
> *E0 . . A0 213864.671250 secs E0 => schbench:104838
> E0 . *D0 A0
>
> This helps in visualizing how a benchmark like schbench is spread over
> the available cpus while also knowing which cpus are idle(.) and which
> are not(**). This will be more useful as number of CPUs increase.
>
> Signed-off-by: Madadi Vineeth Reddy <vineethr@xxxxxxxxxxxxx>
> Reviewed-and-tested-by: Athira Rajeev <atrajeev@xxxxxxxxxxxxxxxxxx <mailto:atrajeev@xxxxxxxxxxxxxxxxxx>>
>
> ---
> Changes in v2:
> - Add support for giving multiple command-names in CSV. (Namhyung Kim)
> - Add fuzzy name matching option. (Chen Yu)
> - Add Reviewed-and-tested-by tag from Athira Rajeev.
> - Rebase against perf-tools-next commit d2307fd4f989 ("perf maps: Add/use
> a sorted insert for fixup overlap and insert")
> - Link to v1: https://lore.kernel.org/lkml/20240417152521.80340-1-vineethr@xxxxxxxxxxxxx/
> ---
> tools/perf/Documentation/perf-sched.txt | 8 +++++
> tools/perf/builtin-sched.c | 41 +++++++++++++++++++++++--
> 2 files changed, 46 insertions(+), 3 deletions(-)
>
> diff --git a/tools/perf/Documentation/perf-sched.txt b/tools/perf/Documentation/perf-sched.txt
> index a216d2991b19..6901c192eb6f 100644
> --- a/tools/perf/Documentation/perf-sched.txt
> +++ b/tools/perf/Documentation/perf-sched.txt
> @@ -130,6 +130,14 @@ OPTIONS for 'perf sched map'
> --color-pids::
> Highlight the given pids.
>
> +--command-name::
> + Map output only for the given command name(s). Separate the
> + command names with a comma (without whitespace).
> + (** indicates other tasks while . is idle).
> +
> +--fuzzy-name::
> + Given command name can be partially matched (fuzzy matching).
> +
> OPTIONS for 'perf sched timehist'
> ---------------------------------
> -k::
> diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
> index 5977c49ae2c7..364f48170e65 100644
> --- a/tools/perf/builtin-sched.c
> +++ b/tools/perf/builtin-sched.c
> @@ -156,6 +156,8 @@ struct perf_sched_map {
> const char *color_pids_str;
> struct perf_cpu_map *color_cpus;
> const char *color_cpus_str;
> + const char *command;
> + bool fuzzy;
> struct perf_cpu_map *cpus;
> const char *cpus_str;
> };
> @@ -1538,6 +1540,26 @@ map__findnew_thread(struct perf_sched *sched, struct machine *machine, pid_t pid
> return thread;
> }
>
> +static bool command_matches(const char *comm_str, const char *commands, bool fuzzy_match)
> +{
> + char *commands_copy = strdup(commands);
> + char *token = strtok(commands_copy, ",");

Hmm.. copying and parsing the commands whenever it compares the task
comm looks inefficient. I think you can parse the input string once and
keep the list of names.

> +
> + bool match_found = false;
> +
> + while (token != NULL) {
> + if ((fuzzy_match && strstr(comm_str, token) != NULL) ||
> + strcmp(comm_str, token) == 0) {
> + match_found = true;
> + break;
> + }
> + token = strtok(NULL, ",");
> + }

It could be:

while (token != NULL && !match_found) {
if (fuzzy_match)
match_found = !!strstr(comm_str, token);
else
match_found = !strcmp(comm_str, token);

token = strtok(NULL, ",");
}

But as I said, it'd better not to call strtok() here.

> +
> + free(commands_copy);
> + return match_found;
> +}
> +
> static int map_switch_event(struct perf_sched *sched, struct evsel *evsel,
> struct perf_sample *sample, struct machine *machine)
> {
> @@ -1594,8 +1616,6 @@ static int map_switch_event(struct perf_sched *sched, struct evsel *evsel,
>
> sched->curr_thread[this_cpu.cpu] = thread__get(sched_in);
>
> - printf(" ");
> -
> new_shortname = 0;
> if (!tr->shortname[0]) {
> if (!strcmp(thread__comm_str(sched_in), "swapper")) {
> @@ -1605,7 +1625,8 @@ static int map_switch_event(struct perf_sched *sched, struct evsel *evsel,
> */
> tr->shortname[0] = '.';
> tr->shortname[1] = ' ';
> - } else {
> + } else if (!sched->map.command || command_matches(thread__comm_str(sched_in),
> + sched->map.command, sched->map.fuzzy)) {

We usually align the indentation using the open parenthesis.
Maybe you can rename the function and pass the sched pointer directly
to reduce the argument.

bool sched_match_task(struct perf_sched *sched, const char *comm_str)
{
...
}

Or you could pass thread instead of comm_str and possibly support
matching with TID too.

Thanks,
Namhyung


> tr->shortname[0] = sched->next_shortname1;
> tr->shortname[1] = sched->next_shortname2;
>
> @@ -1618,10 +1639,19 @@ static int map_switch_event(struct perf_sched *sched, struct evsel *evsel,
> else
> sched->next_shortname2 = '0';
> }
> + } else {
> + tr->shortname[0] = '*';
> + tr->shortname[1] = '*';
> }
> new_shortname = 1;
> }
>
> + if (sched->map.command && !command_matches(thread__comm_str(sched_in), sched->map.command,
> + sched->map.fuzzy))
> + goto skip;
> +
> + printf(" ");
> +
> for (i = 0; i < cpus_nr; i++) {
> struct perf_cpu cpu = {
> .cpu = sched->map.comp ? sched->map.comp_cpus[i].cpu : i,
> @@ -1678,6 +1708,7 @@ static int map_switch_event(struct perf_sched *sched, struct evsel *evsel,
> out:
> color_fprintf(stdout, color, "\n");
>
> +skip:
> thread__put(sched_in);
>
> return 0;
> @@ -3560,6 +3591,10 @@ int cmd_sched(int argc, const char **argv)
> "highlight given CPUs in map"),
> OPT_STRING(0, "cpus", &sched.map.cpus_str, "cpus",
> "display given CPUs in map"),
> + OPT_STRING(0, "command-name", &sched.map.command, "command",
> + "map output only for the given command name(s)"),
> + OPT_BOOLEAN(0, "fuzzy-name", &sched.map.fuzzy,
> + "given command name can be partially matched (fuzzy matching)"),
> OPT_PARENT(sched_options)
> };
> const struct option timehist_options[] = {
> --
> 2.31.1
>