Re: [PATCH] perf data: Allow to use stdio functions for pipe mode

From: Namhyung Kim
Date: Fri Oct 30 2020 - 01:34:46 EST


Hi Jiri,

On Thu, Oct 29, 2020 at 8:57 PM Jiri Olsa <jolsa@xxxxxxxxxx> wrote:
>
> On Wed, Oct 28, 2020 at 05:56:32PM +0900, Namhyung Kim wrote:
> > When perf data is in a pipe, perf reads each event separately using
> > the read(2) syscall. This is a huge performance bottleneck when
> > processing large data, as in perf inject. perf inject also needs to
> > use the write(2) syscall for the output.
> >
> > So convert it to use the buffered I/O functions in the stdio library
> > for pipe data. This drops the inject-build-id bench time from 20ms
> > to 8ms.
> >
> > $ perf bench internals inject-build-id
> > # Running 'internals/inject-build-id' benchmark:
> > Average build-id injection took: 8.074 msec (+- 0.013 msec)
> > Average time per event: 0.792 usec (+- 0.001 usec)
> > Average memory usage: 8328 KB (+- 0 KB)
> > Average build-id-all injection took: 5.490 msec (+- 0.008 msec)
> > Average time per event: 0.538 usec (+- 0.001 usec)
> > Average memory usage: 7563 KB (+- 0 KB)
> >
> > This patch enables it only for perf inject when used with a pipe
> > (which is its default behavior). Maybe we could do it for perf record
> > and/or report later.
> >
> > Signed-off-by: Namhyung Kim <namhyung@xxxxxxxxxx>
> > ---
> > tools/perf/builtin-inject.c | 2 ++
> > tools/perf/util/data.c | 36 +++++++++++++++++++++++++++++++++---
> > tools/perf/util/data.h | 11 ++++++++++-
> > tools/perf/util/header.c | 8 ++++----
> > tools/perf/util/session.c | 7 ++++---
> > 5 files changed, 53 insertions(+), 11 deletions(-)
> >
> > diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
> > index 452a75fe68e5..14d6c88fed76 100644
> > --- a/tools/perf/builtin-inject.c
> > +++ b/tools/perf/builtin-inject.c
> > @@ -853,10 +853,12 @@ int cmd_inject(int argc, const char **argv)
> > .output = {
> > .path = "-",
> > .mode = PERF_DATA_MODE_WRITE,
> > + .use_stdio = true,
> > },
> > };
> > struct perf_data data = {
> > .mode = PERF_DATA_MODE_READ,
> > + .use_stdio = true,
> > };
> > int ret;
> >
> > diff --git a/tools/perf/util/data.c b/tools/perf/util/data.c
> > index c47aa34fdc0a..47b5a4b50ca5 100644
> > --- a/tools/perf/util/data.c
> > +++ b/tools/perf/util/data.c
> > @@ -174,8 +174,16 @@ static bool check_pipe(struct perf_data *data)
> > is_pipe = true;
> > }
> >
> > - if (is_pipe)
> > - data->file.fd = fd;
> > + if (is_pipe) {
> > + if (data->use_stdio) {
> > + const char *mode;
> > +
> > + mode = perf_data__is_read(data) ? "r" : "w";
> > + data->file.fptr = fdopen(fd, mode);
>
> I guess fdopen should never fail right? but I think we should
> add BUG_ON(data->file.fptr == NULL) or something

The man page says it may fail when the mode is invalid or when malloc
fails internally. I will add the check.

>
> other than this the change looks good, I can see the speedup
> in bench as well

Thanks!
Namhyung