Re: [RFCv2 00/48] perf tools: Add threads to record command

From: Namhyung Kim
Date: Fri Oct 05 2018 - 02:14:36 EST


Hi,

Sorry for the late reply.

On Mon, Sep 24, 2018 at 09:32:11PM +0300, Alexey Budankov wrote:
> On 24.09.2018 17:29, Jiri Olsa wrote:
> > On Mon, Sep 24, 2018 at 04:09:09PM +0300, Alexey Budankov wrote:
> >> Command:
> >>
> >> /usr/bin/time ./perf.thr record --threads=T \
> >> -N -B -T -R --call-graph dwarf,1024 --user-regs=ip,bp,sp \
> >> -e cpu/period=P,event=0x3c/Duk,\
> >> cpu/period=P,umask=0x3/Duk,\
> >> cpu/period=P,event=0xc0/Duk,\
> >> cpu/period=0xaae61,event=0xc2,umask=0x10/uk,\
> >> cpu/period=0x11171,event=0xc2,umask=0x20/uk,\
> >> cpu/period=0x11171,event=0xc2,umask=0x40/uk \
> >> --clockid=monotonic_raw -- ./matrix.gcc
> >>
> >> Workload: matrix multiplication in 128 threads
> >>
> >> T : 272
> >> P (period, ms) : 0.35
> >> runtime overhead (%) : 13x ~ 87.73 / 6.81
> >> data loss (%) : 0
> >> LOST events : 36
> >> SAMPLE events : 8048542
> >> perf.data size (GiB) : 10
> >
> > any idea why does it have some much more samples?
>
> Presumably, this is because period is 350us and this is the smallest
> one that perf.thr manages to capture data without data loss (=0) when T=272.
> However, during collection, I get message that max sampling frequency
> is lowered to 3KHz.

It also took much longer than AIO: 87.73 vs 22.34 (N=272).
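
For reference, a quick sketch of the arithmetic behind these numbers. The
variable names are mine, and I'm assuming that 87.73, 22.34 and 6.81 are all
elapsed times in the same unit, with 6.81 being the run without profiling
(that is how I read "87.73 / 6.81" in the quoted table):

```python
# Figures quoted in this thread. Units are assumed to be identical;
# the thread does not state them explicitly.
threads_time = 87.73   # perf.thr record --threads=272
aio_time = 22.34       # AIO run, N=272
baseline_time = 6.81   # assumed: workload without profiling


def slowdown(measured, reference):
    """Return how many times longer `measured` is than `reference`."""
    return measured / reference


threads_vs_baseline = slowdown(threads_time, baseline_time)
threads_vs_aio = slowdown(threads_time, aio_time)

print(f"threads vs baseline: {threads_vs_baseline:.1f}x")
print(f"threads vs AIO:      {threads_vs_aio:.1f}x")
```

So the threaded run is roughly 13x the baseline, which matches the table,
and close to 4x the AIO run.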

Thanks,
Namhyung