Re: [PATCH v4] perf record: collect user registers set jointly with dwarf stacks

From: Alexey Budankov
Date: Thu May 30 2019 - 04:28:28 EST



On 29.05.2019 22:25, Arnaldo Carvalho de Melo wrote:
> Em Wed, May 29, 2019 at 05:30:49PM +0300, Alexey Budankov escreveu:
>>
<SNIP>
>> ---
>> tools/perf/util/evsel.c | 11 ++++++++++-
>> 1 file changed, 10 insertions(+), 1 deletion(-)
>>
>> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
>> index a6f572a40deb..426dfefeecda 100644
>> --- a/tools/perf/util/evsel.c
>> +++ b/tools/perf/util/evsel.c
>> @@ -669,6 +669,9 @@ int perf_evsel__group_desc(struct perf_evsel *evsel, char *buf, size_t size)
>> return ret;
>> }
>>
>> +#define DWARF_REGS_MASK ((1ULL << PERF_REG_IP) | \
>> + (1ULL << PERF_REG_SP))
>> +
>> static void __perf_evsel__config_callchain(struct perf_evsel *evsel,
>> struct record_opts *opts,
>> struct callchain_param *param)
>> @@ -702,7 +705,13 @@ static void __perf_evsel__config_callchain(struct perf_evsel *evsel,
>> if (!function) {
>> perf_evsel__set_sample_bit(evsel, REGS_USER);
>> perf_evsel__set_sample_bit(evsel, STACK_USER);
>> - attr->sample_regs_user |= PERF_REGS_MASK;
>> + if (opts->sample_user_regs) {
>
> Where are you checking that opts->sample_user_regs doesn't have either
> IP or SP?

Sure. The the intention was to avoid such a complication, merge two
masks and provide explicit warning that the resulting mask is extended.

If you still see the checking and auto detection of the exact mask
extension as essential it can be implemented.

>
> So, __perf_evsel__config_callchain its the routine that sets up the
> attr->sample_regs_user when callchains are asked for, and what was it
> doing? Asking for _all_ user regs, right?
>
> I.e. what you're saying is that when --callgraph-dwarf is asked for,
> then only IP and BP are needed, and we should stop doing that, so that
> would be a first patch, if that is the case. I.e. a patch that doesn't
> even mention opts->sample_user_regs.
>
> Then, a second patch would fix the opt->sample_user_regs request clash
> with --callgraph dwarf, i.e. it would do something like:
>
> if ((opts->sample_regs_user & DWARF_REGS_MASK) != DWARF_REGS_MASK) {
> char * ip = (opts->sample_regs_user & (1ULL << PERF_REG_IP)) ? NULL : "IP",
> * sp = (opts->sample_regs_user & (1ULL << PERF_REG_SP)) ? NULL : "SP",
> * all = (!ip && !sp) ? "s" : "";
>
> pr_warning("WARNING: specified --user-regs register set doesn't include register%s "
> "needed by also specified --call-graph=dwarf, auto adding %s%s%s register%s.\n",
> all, ip, all : ", " : "", sp, all);
> }
>
> This if and only if all the registers that are needed to do DWARF
> unwinding are just IP and BP, which doesn't look like its true, since
> when no --user_regs is set (i.e. opts->user_regs is not set) then we
> continue asking for PERF_REGS_MASK...
>
> Can you check where I'm missing something?

1. -g call-graph dwarf,K full_regs
2. --user-regs=user_regs user_regs
3. -g call-graph dwarf,K --user-regs=user_regs user_regs + dwarf_regs

The default behavior stays the same for cases 1, 2 above.
For case 3 register set becomes the one asked using --user_regs option.
If the option value misses IP or SP or the both then they are explicitly
added to the option value and a warning message mentioning the exact
added registers is provided.

>
> Jiri DWARF unwind uses just IP and SP? Looking at
> tools/perf/util/unwind-libunwind-local.c's access_reg() I don't think
> so, right?

If you ask me, AFAIK, DWARF unwind rules sometimes can refer additional
general purpose registers for frames boundaries calculation.

Thanks,
Alexey

>
> - Arnaldo
>
>> + attr->sample_regs_user |= DWARF_REGS_MASK;
>> + pr_warning("WARNING: specified --user-regs register set doesn't include registers "
>> + "needed by also specified --call-graph=dwarf, auto adding IP, SP registers.\n");
>> + } else {
>> + attr->sample_regs_user |= PERF_REGS_MASK;
>> + }
>> attr->sample_stack_user = param->dump_size;
>> attr->exclude_callchain_user = 1;
>> } else {
>> --
>> 2.20.1
>