Re: tracing child threads with address filtering using intel_pt in perf

From: Mansour Alharthi
Date: Tue Oct 09 2018 - 02:31:15 EST


Thank you, Alex, for the prompt response and the fix!

It works perfectly now.

Mansour


On 10/08/2018 10:25 AM, Alexander Shishkin wrote:
> Alexander Shishkin <alexander.shishkin@xxxxxxxxxxxxxxx> writes:
>
>> "Alharthi, Mansour A" <mansourah@xxxxxxxxxx> writes:
>>
>>> Hello all,
>> Hi,
>>
>>> Assume this test code:
>>>
>>> thread_start(){
>>> ...
>>> test();
>>> ...
>>> }
>>>
>>> test(){
>>> printf("test");
>>> }
>>>
>>> main(){
>>> ...
>>> pthread_create(......, thread_start,....);
>>> }
>> Can you include the complete test case code?
>>
>>> Tracing the above program with the following command:
>>> perf record -v -m 512,10000 -e intel_pt//u -T --switch-events --filter 'filter * @ ./test' -- ./test
>> Can you run it with -vvv and also include its output?
> Scratch that. Instead, can you try the below patch and see if it works
> for you?
>
> Thanks,
> --
> Alex
>
> From 029a726b63ed6ebef527393704c83dab9c76fb9a Mon Sep 17 00:00:00 2001
> From: Alexander Shishkin <alexander.shishkin@xxxxxxxxxxxxxxx>
> Date: Mon, 8 Oct 2018 17:16:30 +0300
> Subject: [PATCH] perf: Copy parent's address filter offsets on clone
>
> When a child event is allocated in the inherit_event() path, the VMA
> based filter offsets are not copied from the parent, even though the
> address space mapping of the new task remains the same, which leads
> to no trace for the new task until exec.
>
> Signed-off-by: Alexander Shishkin <alexander.shishkin@xxxxxxxxxxxxxxx>
> ---
> kernel/events/core.c | 15 +++++++++++++++
> 1 file changed, 15 insertions(+)
>
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index c80549bf82c6..8cecbd61cd90 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -1254,6 +1254,7 @@ static void put_ctx(struct perf_event_context *ctx)
> * perf_event_context::lock
> * perf_event::mmap_mutex
> * mmap_sem
> + * perf_addr_filters_head::lock
> *
> * cpu_hotplug_lock
> * pmus_lock
> @@ -10058,6 +10059,20 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
>  			goto err_per_task;
>  		}
>
> +		/*
> +		 * Clone the parent's vma offsets: they are valid until exec()
> +		 * even if the mm is not shared with the parent.
> +		 */
> +		if (event->parent) {
> +			struct perf_addr_filters_head *ifh = perf_event_addr_filters(event);
> +
> +			raw_spin_lock_irq(&ifh->lock);
> +			memcpy(event->addr_filters_offs,
> +			       event->parent->addr_filters_offs,
> +			       pmu->nr_addr_filters * sizeof(unsigned long));
> +			raw_spin_unlock_irq(&ifh->lock);
> +		}
> +
>  		/* force hw sync on the address filters */
>  		event->addr_filters_gen = 1;
>  	}
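
For readers following the thread, the effect of the patch can be illustrated with a small userspace model. This is only a sketch and not kernel code: `struct model_event`, `model_event_alloc()`, `model_event_free()`, `model_demo()` and `NR_ADDR_FILTERS` are hypothetical names made up for the illustration, and the locking around the copy (perf_addr_filters_head::lock in the patch) is left out because the model is single-threaded. It shows a child event starting with zeroed filter offsets unless they are copied from the parent, which matches the "no trace for the new task until exec" symptom described in the commit message.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for pmu->nr_addr_filters. */
#define NR_ADDR_FILTERS 4

/* Hypothetical, heavily reduced model of struct perf_event. */
struct model_event {
	struct model_event *parent;       /* NULL for a top-level event */
	unsigned long *addr_filters_offs; /* one offset per address filter */
};

/*
 * Allocate a model event. With copy_from_parent == 0 the child's offsets
 * stay zeroed (the behaviour before the patch). With copy_from_parent != 0
 * the parent's offsets are copied (the behaviour the patch adds).
 */
static struct model_event *model_event_alloc(struct model_event *parent,
					     int copy_from_parent)
{
	struct model_event *event = calloc(1, sizeof(*event));

	if (!event)
		return NULL;

	event->parent = parent;
	event->addr_filters_offs = calloc(NR_ADDR_FILTERS,
					  sizeof(unsigned long));
	if (!event->addr_filters_offs) {
		free(event);
		return NULL;
	}

	/* Mirrors the memcpy() in the patch, minus the spinlock. */
	if (parent && copy_from_parent)
		memcpy(event->addr_filters_offs, parent->addr_filters_offs,
		       NR_ADDR_FILTERS * sizeof(unsigned long));

	return event;
}

static void model_event_free(struct model_event *event)
{
	if (!event)
		return;
	free(event->addr_filters_offs);
	free(event);
}

/*
 * Return the first filter offset seen by a child of a parent whose first
 * offset is parent_off, with or without the copy step.
 */
static unsigned long model_demo(unsigned long parent_off, int copy_from_parent)
{
	struct model_event *parent = model_event_alloc(NULL, 0);
	struct model_event *child;
	unsigned long seen;

	assert(parent);
	parent->addr_filters_offs[0] = parent_off;

	child = model_event_alloc(parent, copy_from_parent);
	assert(child);
	seen = child->addr_filters_offs[0];

	model_event_free(child);
	model_event_free(parent);
	return seen;
}
```

Calling `model_demo(off, 0)` models the unpatched path, where the child sees a zero offset regardless of the parent, and `model_demo(off, 1)` models the patched path, where the child sees the parent's offset.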