Re: kprobe: __blkdev_put probe is missed

From: Masami Hiramatsu
Date: Fri Jun 19 2020 - 01:12:46 EST


Hi Ming,

On Fri, 19 Jun 2020 07:19:01 +0800
Ming Lei <ming.lei@xxxxxxxxxx> wrote:

> > I'm using 5.4 on ubuntu and can not reproduce it with kprobe_event.
> >
> > root@devnote2:/sys/kernel/tracing# uname -a
> > Linux devnote2 5.4.0-37-generic #41-Ubuntu SMP Wed Jun 3 18:57:02 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
> > root@devnote2:/sys/kernel/tracing# echo p __blkdev_put > kprobe_events
> > root@devnote2:/sys/kernel/tracing# echo 1 > events/kprobes/p___blkdev_put_0/enable
> > root@devnote2:/sys/kernel/tracing# cat trace
> > # tracer: nop
> > #
> > # entries-in-buffer/entries-written: 0/0 #P:8
> > #
> > # _-----=> irqs-off
> > # / _----=> need-resched
> > # | / _---=> hardirq/softirq
> > # || / _--=> preempt-depth
> > # ||| / delay
> > # TASK-PID CPU# |||| TIMESTAMP FUNCTION
> > # | | | |||| | |
> > root@devnote2:/sys/kernel/tracing# blockdev --getbsz /dev/nvme0n1
> > 4096
> > root@devnote2:/sys/kernel/tracing# cat trace
> > # tracer: nop
> > #
> > # entries-in-buffer/entries-written: 1/1 #P:8
> > #
> > # _-----=> irqs-off
> > # / _----=> need-resched
> > # | / _---=> hardirq/softirq
> > # || / _--=> preempt-depth
> > # ||| / delay
> > # TASK-PID CPU# |||| TIMESTAMP FUNCTION
> > # | | | |||| | |
> > <...>-111740 [002] .... 301734.476991: p___blkdev_put_0: (__blkdev_put+0x0/0x1e0)
> >
> > Hmm, maybe some issue in the latest kernel...?
>
> Hello Masami,
>
> I am testing the latest upstream kernel, your trace actually reproduces
> this issue.

OK.

>
> After 'blockdev --getbsz /dev/nvme0n1' returns, __blkdev_put() should
> have been called two times(one for partition, and the other for disk),
> however kprobe trace just shows one time of calling this function.
>
> If trace_printk() is added at the entry of __blkdev_put() manually,
> you will see that __blkdev_put() is called two times in 'blockdev
> --getbsz /dev/nvme0n1'.

OK, let me check again on the latest kernel.
Here I tested with qemu.

root@devnote2:/sys/kernel/debug/tracing# uname -a
Linux devnote2 5.8.0-rc1+ #26 SMP PREEMPT Fri Jun 19 12:12:53 JST 2020 x86_64 x86_64 x86_64 GNU/Linux

And we have a (virtual) sda with 1 partition.

root@devnote2:/sys/kernel/debug/tracing# cat /proc/partitions
major minor #blocks name

8 0 10240 sda
8 1 9216 sda1

OK, then let's make events (for sure)

root@devnote2:/sys/kernel/debug/tracing# echo p __blkdev_put >> kprobe_events
root@devnote2:/sys/kernel/debug/tracing# echo r __blkdev_put >> kprobe_events
root@devnote2:/sys/kernel/debug/tracing# echo p blkdev_put >> kprobe_events

There are 3 events in the kernel, blkdev_put() and __blkdev_put() and
the return of __blkdev_put().
Then enable it and access to */dev/sda* (a disk)

root@devnote2:/sys/kernel/debug/tracing# echo 1 > events/kprobes/enable
root@devnote2:/sys/kernel/debug/tracing# blockdev --getbsz /dev/sda
4096
root@devnote2:/sys/kernel/debug/tracing# echo 0 > events/kprobes/enable
root@devnote2:/sys/kernel/debug/tracing# cat trace
# tracer: nop
#
# entries-in-buffer/entries-written: 3/3 #P:8
#
# _-----=> irqs-off
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# ||| / delay
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
# | | | |||| | |
blockdev-185 [002] ...1 72.604266: p_blkdev_put_0: (blkdev_put+0x0/0x130)
blockdev-185 [002] ...1 72.604276: p___blkdev_put_0: (__blkdev_put+0x0/0x220)
blockdev-185 [002] d..2 72.604288: r___blkdev_put_0: (blkdev_put+0x50/0x130 <- __blkdev_put)

So the __blkdev_put() is called once from blkdev_put().
Next, we do same trace with accessing */dev/sda1* (a partition).

root@devnote2:/sys/kernel/debug/tracing# echo > trace
root@devnote2:/sys/kernel/debug/tracing# echo 1 > events/kprobes/enable
root@devnote2:/sys/kernel/debug/tracing# blockdev --getbsz /dev/sda1
4096
root@devnote2:/sys/kernel/debug/tracing# echo 0 > events/kprobes/enable
root@devnote2:/sys/kernel/debug/tracing# cat trace
# tracer: nop
#
# entries-in-buffer/entries-written: 5/5 #P:8
#
# _-----=> irqs-off
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# ||| / delay
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
# | | | |||| | |
blockdev-191 [005] ...1 194.401517: p_blkdev_put_0: (blkdev_put+0x0/0x130)
blockdev-191 [005] ...1 194.401527: p___blkdev_put_0: (__blkdev_put+0x0/0x220)
blockdev-191 [005] ...1 194.401529: p___blkdev_put_0: (__blkdev_put+0x0/0x220)
blockdev-191 [005] d..2 194.401535: r___blkdev_put_0: (__blkdev_put+0x1ea/0x220 <- __blkdev_put)
blockdev-191 [005] d..2 194.401536: r___blkdev_put_0: (blkdev_put+0x50/0x130 <- __blkdev_put)

In this case, we can see the __blkdev_put() is called twice, from
blkdev_put() and __blkdev_put() itself (nested call).

So, it seems that the kprobe event correctly works.

Could you do same thing on your environment, and share the result?

Thank you,

--
Masami Hiramatsu <mhiramat@xxxxxxxxxx>