Re: [PATCH 03/10] perf sched timehist: Handle zero sample->tid properly

From: Namhyung Kim
Date: Thu Dec 08 2016 - 09:20:24 EST

On Wed, Dec 07, 2016 at 11:06:29AM +0900, Namhyung Kim wrote:
> On Mon, Dec 05, 2016 at 07:52:57PM -0800, David Ahern wrote:
> > On 12/5/16 7:40 PM, Namhyung Kim wrote:
> > > Sometimes samples have tid of 0 but non-0 pid. It ends up having a
> >
> > Any idea how that happens?
> It seems that an exiting task wakes up its parent and the parent might
> call wait(2) concurrently. So at the time of calling last schedule(),
> its pid (and tgid) link might be unhashed by the parent and can have 0
> sample->tid and/or sample->pid depending on timing IMHO. Not sure
> anything guarantees that the sample tid/pid is preserved during the
> event. From a quick look I couldn't find..

I found following line in my data file.

$ perf script --time 52460.536907,
swapper 0 [000] 52460.536907: sched:sched_switch: pool:3775 [120] x ==> at-spi2-registr:1961 [120]
7f3963 __schedule (/lib/modules/4.8.3-1-ARCH/build/vmlinux)
7f3d7c schedule (/lib/modules/4.8.3-1-ARCH/build/vmlinux)
280b14 do_exit (/lib/modules/4.8.3-1-ARCH/build/vmlinux)
280fa7 [unknown] (/lib/modules/4.8.3-1-ARCH/build/vmlinux)
7f7cf2 entry_SYSCALL_64_fastpath (/lib/modules/4.8.3-1-ARCH/build/vmlinux)

As you can see task 3775 called schedule() during exit but sample tid
was 0. In this case prev_pid is still correct (though it's always pid
in the root namespace) since its number was saved in task_struct at the
time of creation and not changed.

I'll change to check prev_pid for idle task.