Re: [PATCH] perf data: export to JSON

From: Jiri Olsa
Date: Wed Mar 24 2021 - 14:21:32 EST


On Wed, Mar 24, 2021 at 09:06:50AM -0400, Nicholas Fraser wrote:
> This adds preliminary support to dump the contents of a perf.data file to
> human-readable JSON.
>
> The "perf data" command currently only supports exporting to Common Trace
> Format and it doesn't do symbol resolution among other things. Dumping to JSON
> means the data can be trivially parsed by anything without any dependencies
> (besides a JSON parser.) We use this to import the data into a tool on Windows
> where integrating perf or libbabeltrace is impractical.

hi,
exciting ;-) and curious, which tool is that?

>
> The JSON is encoded using some trivial fprintf() commands; there is no
> dependency on any JSON library. It currently only outputs samples. Other stuff
> like processes and mappings could easily be added as needed. The output is of
> course huge but it compresses well enough.

we already have zstd support compiled in for compressing samples,
should be easy to use it for compressing the output of this right
away

SNIP

> argc = parse_options(argc, argv, options,
> data_convert_usage, 0);
> if (argc) {
> @@ -84,11 +82,38 @@ static int cmd_data_convert(int argc, const char **argv)
> return -1;
> }
>
> + if (to_json && to_ctf) {
> + pr_err("You cannot specify both --to-ctf and --to-json.\n");
> + return -1;
> + }
> + if (!to_json && !to_ctf) {
> + pr_err("You must specify one of --to-ctf or --to-json.\n");
> + return -1;
> + }
> +

condition below should be under bt_convert__perf2json

> + if (to_json) {
> + if (opts.all) {
> + pr_err("--all is currently unsupported for JSON output.\n");
> + return -1;
> + }
> + if (opts.tod) {
> + pr_err("--tod is currently unsupported for JSON output.\n");
> + return -1;
> + }
> + if (opts.force) {
> + pr_err("--force is currently unsupported for JSON output.\n");
> + return -1;
> + }

I understand not supporting opts.all or opts.tod, but 'force'
support means just assigning 'force' to struct perf_data

> + return bt_convert__perf2json(input_name, to_json, &opts);
> + }
> +
> if (to_ctf) {
> #ifdef HAVE_LIBBABELTRACE_SUPPORT
> return bt_convert__perf2ctf(input_name, to_ctf, &opts);
> #else
> - pr_err("The libbabeltrace support is not compiled in.\n");
> + pr_err("The libbabeltrace support is not compiled in. perf should be "
> + "compiled with environment variables LIBBABELTRACE=1 and "
> + "LIBBABELTRACE_DIR=/path/to/libbabeltrace/\n");

please indent above 2 lines under the "The..." start

SNIP

> +static int process_sample_event(struct perf_tool *tool,
> + union perf_event *event __maybe_unused,
> + struct perf_sample *sample,
> + struct evsel *evsel __maybe_unused,
> + struct machine *machine)
> +{
> + struct convert_json *c = container_of(tool, struct convert_json, tool);
> + FILE *out = c->out;
> + struct addr_location al, tal;
> + u8 cpumode = PERF_RECORD_MISC_USER;
> +
> + if (machine__resolve(machine, &al, sample) < 0) {
> + return 0;

you should fail in here

> + }
> +
> + if (c->first) {
> + c->first = false;
> + } else {
> + fprintf(out, ",");
> + }

no need for curlies {} if there's just one line code under condition

SNIP

> + struct perf_data data = {
> + .mode = PERF_DATA_MODE_READ,
> + .path = input_name,
> + };
> +
> + c.out = fopen(output_name, "w");
> + if (!c.out) {
> + fprintf(stderr, "error opening output file!\n");
> + return -1;
> + }
> +
> + session = perf_session__new(&data, false, &c.tool);
> + if (IS_ERR(session)) {
> + fprintf(stderr, "error creating perf session!\n");
> + return -1;
> + }
> +
> + if (symbol__init(&session->header.env) < 0) {
> + fprintf(stderr, "symbol init error!\n");
> + return -1;
> + }
> +
> + // Version number for future-proofing. Most additions should be able to be
> + // done in a backwards-compatible way so this should only need to be bumped
> + // if some major breaking change must be made.
> + fprintf(c.out, "{\n\t\"linux-perf-json-version\": 1,");
> +
> + fprintf(c.out, "\n\t\"samples\": [");
> + perf_session__process_events(session);
> + fprintf(c.out, "\n\t]\n}\n");

you should close session with perf_session__delete

> +
> + return 0;
> +}
> diff --git a/tools/perf/util/data-convert-json.h b/tools/perf/util/data-convert-json.h
> new file mode 100644
> index 000000000000..1fcac5ce3ec1
> --- /dev/null
> +++ b/tools/perf/util/data-convert-json.h
> @@ -0,0 +1,9 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef __DATA_CONVERT_JSON_H
> +#define __DATA_CONVERT_JSON_H
> +#include "data-convert.h"
> +
> +int bt_convert__perf2json(const char *input_name, const char *to_ctf,
> + struct perf_data_convert_opts *opts);
> +
> +#endif /* __DATA_CONVERT_JSON_H */

I don't remember why we added util/data-convert-bt.h, but it does not
make sense to me now.. I think both these declarations should be in
util/data-convert.h

it can be done as follow up on top of this change

thanks,
jirka


> diff --git a/tools/perf/util/data-convert.h b/tools/perf/util/data-convert.h
> index feab5f114e37..17c35eb6ab4f 100644
> --- a/tools/perf/util/data-convert.h
> +++ b/tools/perf/util/data-convert.h
> @@ -2,6 +2,8 @@
> #ifndef __DATA_CONVERT_H
> #define __DATA_CONVERT_H
>
> +#include <stdbool.h>
> +
> struct perf_data_convert_opts {
> bool force;
> bool all;
> --
> 2.31.0
>
>