Re: [PATCH v2] perf tool: Fix ppid for synthesized fork events

From: Don Zickus
Date: Fri Mar 27 2015 - 09:10:16 EST


On Thu, Mar 26, 2015 at 03:37:29PM -0600, David Ahern wrote:
> On 3/26/15 3:11 PM, Don Zickus wrote:
> >Sorry for drawing this out. Originally the performance still seemed off.
> >But as we split the patch up to see where the perf impact was, the problem
> >seemed to have disappeared. So we are testing the original patch again.
> >
> >The only difference now is we were playing with the -BN option in perf based
> >on your changelog, just because we never used it before. :-)
>
> I was beyond surprised that you were measuring a 50% hit with the
> first patch. As mentioned in a previous response it only adds the
> processing of 3 additional lines to the already opened and read
> /proc/pid/status file. So, when I wrote this second version I wanted
> to make sure we are only measuring the impact of this change. The
> /proc/pid/status files are read on startup of the record -- before
> any samples are taken.
>
> The intent of '-e cpu-clock -F 1000 -- usleep 1' is to avoid any
> samples since we don't care about them. Really the -a should be
> dropped as well -- no need to open per-cpu events.
>
> -B impacts processing done at the end of the run:
>
> builin-record.c, __cmd_record():
>
> if (!rec->no_buildid)
> process_buildids(rec);
>
> and -N says don't copying anything to ~/.debug. All together it
> tries to focus the measurement to /proc walking.
>
> >
> >One last test without the -BN option and if that looks fine, then we have no
> >objections. Again sorry for dragging this out. I will let you know
> >tomorrow EST.
>
> no problem; appreciate the heads up.

I talked with Joe on my way out the door yesterday and he confirmed, just
removing -BN from our test showed a performance hit with your patch. With
the -BN option, there is no performance hit and we are perfectly fine with
your patch.

So, I guess I am confused how the -BN and your patch could change behaviour.

Just to re-iterate what we did, Joe kicked off a specJBB run and he did 20
captures of two runs (one with the unpatched binary and one with a pached
binary).

for i in {1..20}
do
time perf.unpatched mem record -a -e cpu/mem-loads,ldlat=50/pp -e cpu/mem-stores/pp sleep 10
time perf.patched mem record -a -e cpu/mem-loads,ldlat=50/pp -e cpu/mem-stores/pp sleep 10
done

then we repeat the above test but with -BN in both runs. We compare the
log sizes to make sure they are similar for the random snapshots and compare
the times. With the -BN option, the times are generally within +/- 0.5
seconds of each. Without the -BN option the patched perf binary is
generally +20-40 seconds slower.



However, based on your description above about what the -BN option does, I
am scratching my head about our results. Thoughts?

Cheers,
Don

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/