Re: [PATCH v3 2/2] perf/sdt: Directly record SDT events with 'perf record'

From: Brendan Gregg
Date: Tue Feb 28 2017 - 17:32:08 EST


G'Day Ravi,

On Thu, Feb 23, 2017 at 11:43 PM, Ravi Bangoria
<ravi.bangoria@xxxxxxxxxxxxxxxxxx> wrote:
>
> From: Hemant Kumar <hemant@xxxxxxxxxxxxxxxxxx>
>
> Add support for directly recording SDT events which are present in
> the probe cache. Without this patch, we could probe into SDT events
> using 'perf probe' and 'perf record'. With this patch, we can probe
> the SDT events directly using 'perf record'.
>
> For example :
>
> $ perf list sdt
> sdt_libpthread:mutex_entry [SDT event]
> sdt_libc:setjmp [SDT event]
> ...
>
> $ perf record -a -e sdt_libc:setjmp
> ^C[ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.286 MB perf.data (1065 samples) ]
>
> $ perf script
> bash 803 [002] 6492.190311: sdt_libc:setjmp: (7f1d503b56a1)
> login 488 [001] 6496.791583: sdt_libc:setjmp: (7ff3013d56a1)
> fprintd 11038 [003] 6496.808032: sdt_libc:setjmp: (7fdedf5936a1)
> [...]


Thanks, I like the usage. I ran into trouble testing on Node.js:

# ./perf buildid-cache --add `which node`
# ./perf list | grep sdt_node
sdt_node:gc__done [SDT event]
sdt_node:gc__start [SDT event]
sdt_node:http__client__request [SDT event]
sdt_node:http__client__response [SDT event]
sdt_node:http__server__request [SDT event]
sdt_node:http__server__response [SDT event]
sdt_node:net__server__connection [SDT event]
sdt_node:net__stream__end [SDT event]
# ./perf record -e sdt_node:http__server__request -a
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.308 MB perf.data ]
# ./perf script
#

No events, although I can see that perf had set up the uprobe:

# cat /sys/kernel/debug/tracing/uprobe_events
p:sdt_node/http__server__request /usr/local/bin/node:0x00000000009c2e69

Ok. Is my workload actually hitting the probe? Trying from bcc/eBPF:

# /mnt/src/bcc/tools/trace.py 'u:node:http__server__request'
failed to enable probe 'http__server__request'; a possible cause can be that the probe requires a pid to enable
# /mnt/src/bcc/tools/trace.py -p `pgrep node` 'u:node:http__server__request'
In file included from /virtual/main.c:41:
In file included from /lib/modules/4.10.0-rc8-virtual/build/include/linux/blkdev.h:9:
In file included from /lib/modules/4.10.0-rc8-virtual/build/include/linux/genhd.h:64:
In file included from /lib/modules/4.10.0-rc8-virtual/build/include/linux/device.h:24:
In file included from /lib/modules/4.10.0-rc8-virtual/build/include/linux/pinctrl/devinfo.h:21:
In file included from /lib/modules/4.10.0-rc8-virtual/build/include/linux/pinctrl/consumer.h:17:
In file included from /lib/modules/4.10.0-rc8-virtual/build/include/linux/seq_file.h:10:
/lib/modules/4.10.0-rc8-virtual/build/include/linux/fs.h:2648:9: warning: comparison of unsigned enum expression < 0 is always false [-Wtautological-compare]
if (id < 0 || id >= READING_MAX_ID)
~~ ^ ~
1 warning generated.
PID TID COMM FUNC
7646 7646 node http__server__request
7646 7646 node http__server__request
7646 7646 node http__server__request
^C

(Ignore the warning; I just asked lkml about it.) So that works. It
instrumented:

# cat /sys/kernel/debug/tracing/uprobe_events
p:uprobes/p__usr_local_bin_node_0x9c2e69_bcc_25410 /usr/local/bin/node:0x00000000009c2e69
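
To compare the two uprobe definitions above without eyeballing them, here is a minimal Python sketch. It assumes the usual "p:GROUP/EVENT PATH:OFFSET" layout of uprobe_events lines; parse_uprobe_def and UprobeDef are hypothetical helper names, not part of perf or bcc. It only shows that perf and bcc placed their probes at the same file offset, so probe placement by itself does not appear to be the difference.

```python
from typing import NamedTuple


class UprobeDef(NamedTuple):
    """One parsed uprobe_events definition (hypothetical helper type)."""
    kind: str    # "p" for a probe, "r" for a return probe
    group: str
    event: str
    path: str
    offset: int


def parse_uprobe_def(line: str) -> UprobeDef:
    # Assumed layout: "p:GROUP/EVENT PATH:OFFSET [FETCHARGS...]"
    head, target = line.split()[:2]
    kind, _, name = head.partition(":")
    group, _, event = name.partition("/")
    # Split on the last ':' so a path containing ':' is not cut short.
    path, _, offset = target.rpartition(":")
    return UprobeDef(kind, group, event, path, int(offset, 16))


# The two definitions shown in this mail, each joined onto one line.
perf_def = parse_uprobe_def(
    "p:sdt_node/http__server__request "
    "/usr/local/bin/node:0x00000000009c2e69"
)
bcc_def = parse_uprobe_def(
    "p:uprobes/p__usr_local_bin_node_0x9c2e69_bcc_25410 "
    "/usr/local/bin/node:0x00000000009c2e69"
)

# True when both tools target the same binary and file offset.
same_target = (perf_def.path, perf_def.offset) == (bcc_def.path, bcc_def.offset)
print(same_target)
```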

Now retrying perf:

# ./perf record -e sdt_node:http__server__request -a
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.446 MB perf.data (3 samples) ]
# ./perf script
node 7646 [002] 361.012364: sdt_node:http__server__request: (dc2e69)
node 7646 [002] 361.204718: sdt_node:http__server__request: (dc2e69)
node 7646 [002] 361.363043: sdt_node:http__server__request: (dc2e69)
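
As a sanity check that these samples come from the same probe site, the address perf script reports (dc2e69) can be compared with the file offset in uprobe_events (0x9c2e69). A small sketch of the arithmetic; the reading of the difference as a load-address delta for a non-PIE binary is my assumption and was not verified against this node build:

```python
# File offset of the probe, from uprobe_events.
FILE_OFFSET = 0x9C2E69
# Address reported in the perf script output.
SAMPLE_ADDR = 0xDC2E69

# Constant difference between the runtime address and the file offset.
# Assumption: this is where the text segment is mapped relative to its
# file offset (0x400000 is a common base for non-PIE x86-64 binaries).
delta = SAMPLE_ADDR - FILE_OFFSET
print(hex(delta))  # 0x400000
```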

Now perf works.

If I restart the node process, perf record goes back to the broken
state and captures no events.

Creating the probe manually with 'perf probe' doesn't fix it either:

# ./perf probe sdt_node:http__server__request
Added new event:
sdt_node:http__server__request (on %http__server__request in /usr/local/bin/node)

You can now use it in all perf tools, such as:

perf record -e sdt_node:http__server__request -aR sleep 1

Hint: SDT event can be directly recorded with 'perf record'. No need to create probe manually.

Brendan