Re: [PATCH v2] perf sched map: Add command-name, fuzzy-name options to filter the output map
From: Madadi Vineeth Reddy
Date: Tue Jun 11 2024 - 04:47:33 EST
Hi Namhyung,
On 11/06/24 04:25, Namhyung Kim wrote:
> Hello,
>
> On Sat, Jun 08, 2024 at 06:18:29PM +0530, Madadi Vineeth Reddy wrote:
>> By default, perf sched map prints sched-in events for all the tasks
>> which may not be required all the time as it prints lot of symbols
>> and rows to the terminal.
>>
>> With --command-name option, one could specify the specific command(s)
>> for which the map has to be shown. This would help in analyzing the
>> CPU usage patterns easier for that specific command(s). Since multiple
>> PID's might have the same command name, using command-name filter
>> would be more useful for debugging.
>>
>> Multiple command names can be given with a comma separator without
>> whitespace.
>>
>> The --fuzzy-name option can be used if fuzzy name matching is required.
>> For example, "taskname" can be matched to any string that contains
>> "taskname" as its substring.
>>
>> For other tasks, instead of printing the symbol, ** is printed and
>> the same . is used to represent idle. ** is used instead of symbol
>> for other tasks because it helps in clear visualization of command(s)
>> of interest and secondly the symbol itself doesn't mean anything
>> because the sched-in of that symbol will not be printed(first sched-in
>> contains pid and the corresponding symbol).
>>
>> 6.10.0-rc1
>> ==========
>> *A0 213864.670142 secs A0 => migration/0:18
>> *. 213864.670148 secs . => swapper:0
>> . *B0 213864.670217 secs B0 => migration/1:21
>> . *. 213864.670223 secs
>> . . *C0 213864.670247 secs C0 => migration/2:26
>> . . *. 213864.670252 secs
>>
>> 6.10.0-rc1 + patch (--command-name = schbench)
>> =============
>> ** . ** *A0 213864.671055 secs A0 => schbench:104834
>> *B0 . . A0 213864.671156 secs B0 => schbench:104835
>> *C0 . . A0 213864.671187 secs C0 => schbench:104836
>
> I still think some people are interested in sched-out time. For
> example, we don't know when B0 was scheduled out in the above. There
> could be other tasks between B0 and C0 on the CPU 0.
Yes, you are right. When using the --command-name filter, there can be
other tasks in between. This won't be a problem without the --command-name
filtering, as no task will be missed, and we can be sure that the C0 sched-in
time is the B0 sched-out time.
I will add the sched-out time when using the --command-name option in v3.
>
>
>> *D0 . . A0 213864.671219 secs D0 => schbench:104837
>> *E0 . . A0 213864.671250 secs E0 => schbench:104838
>> E0 . *D0 A0
>>
>> This helps in visualizing how a benchmark like schbench is spread over
>> the available cpus while also knowing which cpus are idle(.) and which
>> are not(**). This will be more useful as number of CPUs increase.
>>
>> Signed-off-by: Madadi Vineeth Reddy <vineethr@xxxxxxxxxxxxx>
>> Reviewed-and-tested-by: Athira Rajeev <atrajeev@xxxxxxxxxxxxxxxxxx <mailto:atrajeev@xxxxxxxxxxxxxxxxxx>>
>>
>> ---
>> Changes in v2:
>> - Add support for giving multiple command-names in CSV. (Namhyung Kim)
>> - Add fuzzy name matching option. (Chen Yu)
>> - Add Reviewed-and-tested-by tag from Athira Rajeev.
>> - Rebase against perf-tools-next commit d2307fd4f989 ("perf maps: Add/use
>> a sorted insert for fixup overlap and insert")
>> - Link to v1: https://lore.kernel.org/lkml/20240417152521.80340-1-vineethr@xxxxxxxxxxxxx/
>> ---
>> tools/perf/Documentation/perf-sched.txt | 8 +++++
>> tools/perf/builtin-sched.c | 41 +++++++++++++++++++++++--
>> 2 files changed, 46 insertions(+), 3 deletions(-)
>>
>> diff --git a/tools/perf/Documentation/perf-sched.txt b/tools/perf/Documentation/perf-sched.txt
>> index a216d2991b19..6901c192eb6f 100644
>> --- a/tools/perf/Documentation/perf-sched.txt
>> +++ b/tools/perf/Documentation/perf-sched.txt
>> @@ -130,6 +130,14 @@ OPTIONS for 'perf sched map'
>> --color-pids::
>> Highlight the given pids.
>>
>> +--command-name::
>> + Map output only for the given command name(s). Separate the
>> + command names with a comma (without whitespace).
>> + (** indicates other tasks while . is idle).
>> +
>> +--fuzzy-name::
>> + Given command name can be partially matched (fuzzy matching).
>> +
>> OPTIONS for 'perf sched timehist'
>> ---------------------------------
>> -k::
>> diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
>> index 5977c49ae2c7..364f48170e65 100644
>> --- a/tools/perf/builtin-sched.c
>> +++ b/tools/perf/builtin-sched.c
>> @@ -156,6 +156,8 @@ struct perf_sched_map {
>> const char *color_pids_str;
>> struct perf_cpu_map *color_cpus;
>> const char *color_cpus_str;
>> + const char *command;
>> + bool fuzzy;
>> struct perf_cpu_map *cpus;
>> const char *cpus_str;
>> };
>> @@ -1538,6 +1540,26 @@ map__findnew_thread(struct perf_sched *sched, struct machine *machine, pid_t pid
>> return thread;
>> }
>>
>> +static bool command_matches(const char *comm_str, const char *commands, bool fuzzy_match)
>> +{
>> + char *commands_copy = strdup(commands);
>> + char *token = strtok(commands_copy, ",");
>
> Hmm.. copying and parsing the commands whenever it compares the task
> comm looks inefficient. I think you can parse the input string once and
> keep the list of names.
Sure, I will do that.
>
>> +
>> + bool match_found = false;
>> +
>> + while (token != NULL) {
>> + if ((fuzzy_match && strstr(comm_str, token) != NULL) ||
>> + strcmp(comm_str, token) == 0) {
>> + match_found = true;
>> + break;
>> + }
>> + token = strtok(NULL, ",");
>> + }
>
> It could be:
>
> while (token != NULL && !match_found) {
> if (fuzzy_match)
> match_found = !!strstr(comm_str, token);
> else
> match_found = !strcmp(comm_str, token);
>
> token = strtok(NULL, ",");
> }
This looks much better, will change it in v3. Thank you!
>
> But as I said, it'd better not to call strtok() here.
>
Yes, understood.
>> +
>> + free(commands_copy);
>> + return match_found;
>> +}
>> +
>> static int map_switch_event(struct perf_sched *sched, struct evsel *evsel,
>> struct perf_sample *sample, struct machine *machine)
>> {
>> @@ -1594,8 +1616,6 @@ static int map_switch_event(struct perf_sched *sched, struct evsel *evsel,
>>
>> sched->curr_thread[this_cpu.cpu] = thread__get(sched_in);
>>
>> - printf(" ");
>> -
>> new_shortname = 0;
>> if (!tr->shortname[0]) {
>> if (!strcmp(thread__comm_str(sched_in), "swapper")) {
>> @@ -1605,7 +1625,8 @@ static int map_switch_event(struct perf_sched *sched, struct evsel *evsel,
>> */
>> tr->shortname[0] = '.';
>> tr->shortname[1] = ' ';
>> - } else {
>> + } else if (!sched->map.command || command_matches(thread__comm_str(sched_in),
>> + sched->map.command, sched->map.fuzzy)) {
>
> We usually align the indentation using the open parenthesis.
> Maybe you can rename the function and pass the sched pointer directly
> to reduce the argument.
Sure, got it.
>
> bool sched_match_task(struct perf_sched *sched, const char *comm_str)
> {
> ...
> }
>
> Or you could pass thread instead of comm_str and possibly support
> matching with TID too.
Do you want me to add another command line option to support matching
with TID?
Thanks for all the suggestions. Will implement them and send v3.
Thanks and Regards
Madadi Vineeth Reddy
>
> Thanks,
> Namhyung
>
>
>> tr->shortname[0] = sched->next_shortname1;
>> tr->shortname[1] = sched->next_shortname2;
>>
>> @@ -1618,10 +1639,19 @@ static int map_switch_event(struct perf_sched *sched, struct evsel *evsel,
>> else
>> sched->next_shortname2 = '0';
>> }
>> + } else {
>> + tr->shortname[0] = '*';
>> + tr->shortname[1] = '*';
>> }
>> new_shortname = 1;
>> }
>>
>> + if (sched->map.command && !command_matches(thread__comm_str(sched_in), sched->map.command,
>> + sched->map.fuzzy))
>> + goto skip;
>> +
>> + printf(" ");
>> +
>> for (i = 0; i < cpus_nr; i++) {
>> struct perf_cpu cpu = {
>> .cpu = sched->map.comp ? sched->map.comp_cpus[i].cpu : i,
>> @@ -1678,6 +1708,7 @@ static int map_switch_event(struct perf_sched *sched, struct evsel *evsel,
>> out:
>> color_fprintf(stdout, color, "\n");
>>
>> +skip:
>> thread__put(sched_in);
>>
>> return 0;
>> @@ -3560,6 +3591,10 @@ int cmd_sched(int argc, const char **argv)
>> "highlight given CPUs in map"),
>> OPT_STRING(0, "cpus", &sched.map.cpus_str, "cpus",
>> "display given CPUs in map"),
>> + OPT_STRING(0, "command-name", &sched.map.command, "command",
>> + "map output only for the given command name(s)"),
>> + OPT_BOOLEAN(0, "fuzzy-name", &sched.map.fuzzy,
>> + "given command name can be partially matched (fuzzy matching)"),
>> OPT_PARENT(sched_options)
>> };
>> const struct option timehist_options[] = {
>> --
>> 2.31.1
>>