callchain ABI change with commit 6cbc304f2f360

From: Stephane Eranian
Date: Tue May 05 2020 - 23:37:54 EST


Hi,

I have received reports from users who have noticed a change of
behaviour caused by commit:

6cbc304f2f360 ("perf/x86/intel: Fix unwind errors from PEBS entries (mk-II)")

The change shows up when using PEBS sampling on Intel processors.

Doing simple profiling with:
$ perf record -g -e cycles:pp ...

Before:

1 1595951041120856 0x7f77f8 [0xe8]: PERF_RECORD_SAMPLE(IP, 0x4002):
795385/690513: 0x558aa66a9607 period: 10000019 addr: 0
... FP chain: nr:22
..... 0: fffffffffffffe00
..... 1: 0000558aa66a9607
..... 2: 0000558aa66a8751
..... 3: 0000558a984a3d4f

Entry 1 matches the sampled IP 0x558aa66a9607.

After:

3 487420973381085 0x2f797c0 [0x90]: PERF_RECORD_SAMPLE(IP, 0x4002):
349591/146458: 0x559dcd2ef889 period: 10000019 addr: 0
... FP chain: nr:11
..... 0: fffffffffffffe00
..... 1: 0000559dcd2ef88b
..... 2: 0000559dcd19787d
..... 3: 0000559dcd1cf1be

Entry 1 (0x559dcd2ef88b) does not match the sampled IP (0x559dcd2ef889) anymore.

Before the patch the kernel was stashing the sampled IP from PEBS into
the callchain. After the patch it is stashing the interrupted IP, which
includes the skid.
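
To make the difference concrete, here is a minimal sketch of the check users are effectively doing on the two dumps above. It is only an illustration, not code from perf: the helper names first_frame and skid_bytes are made up, and it assumes that any callchain value at or above PERF_CONTEXT_MAX ((u64)-4095) is a context marker such as 0xfffffffffffffe00 (PERF_CONTEXT_USER) rather than an address.

```python
# Assumption: values >= PERF_CONTEXT_MAX are PERF_CONTEXT_* markers, not addresses.
PERF_CONTEXT_MAX = 0xFFFFFFFFFFFFF001  # (u64)-4095


def first_frame(callchain):
    """Return the first real address in the chain, skipping context markers."""
    for addr in callchain:
        if addr < PERF_CONTEXT_MAX:
            return addr
    return None


def skid_bytes(sample_ip, callchain):
    """Distance in bytes between the first callchain frame and the sampled IP."""
    frame = first_frame(callchain)
    return None if frame is None else frame - sample_ip


# Values copied from the "Before" and "After" dumps in this mail.
before = skid_bytes(
    0x558AA66A9607,
    [0xFFFFFFFFFFFFFE00, 0x558AA66A9607, 0x558AA66A8751, 0x558A984A3D4F],
)
after = skid_bytes(
    0x559DCD2EF889,
    [0xFFFFFFFFFFFFFE00, 0x559DCD2EF88B, 0x559DCD19787D, 0x559DCD1CF1BE],
)
print(before, after)
```

On the "Before" sample the distance is zero; on the "After" sample entry 1 sits 2 bytes past the sampled IP.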

I am trying to understand whether this is an intentional change or not
for the IP.

It seems that stashing the interrupted IP would be more consistent across all
sampling modes, i.e., with and without PEBS: entry 1 would always be
the interrupted IP.
The changelog talks about the ORC unwinder being happier with the
interrupted machine state, but not about the ABI expectation here.
Could you clarify?
Thanks.