Re: [PATCH V2] perf scripts python: Add a script to run instances of perf script in parallel

From: Adrian Hunter
Date: Tue Apr 23 2024 - 09:40:43 EST


On 11/04/24 21:19, Ian Rogers wrote:
> On Wed, Mar 13, 2024 at 5:36 AM Adrian Hunter <adrian.hunter@xxxxxxxxx> wrote:
>>
>> Add a Python script to run a perf script command multiple times in
>> parallel, using perf script options --cpu and --time so that each job
>> processes a different chunk of the data.
>>
>> The script supports the use of normal perf script options like
>> --dlfilter and --script, so that the benefit of running parallel jobs
>> naturally extends to them also. In addition, the --pipe-to option can
>> be used to pipe the standard output to a custom command.
>>
>> Refer to the script's own help text at the end of the patch for more
>> details.
>>
>> The script is useful for Intel PT traces, which can be decoded
>> efficiently by perf script when split by CPU and/or time range.
>> Running the jobs in parallel can decrease the overall decoding time.
>>
>> Signed-off-by: Adrian Hunter <adrian.hunter@xxxxxxxxx>
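
To give a rough idea of what running in parallel means here, the core of it
is just one perf script job per chunk of data, along the lines of this very
simplified sketch (an illustration only, not the script itself; the CPU list
and file names are made up):

import subprocess

# One "perf script" job per CPU, each writing to its own output files.
# The real script also splits by time range and supports --pipe-to,
# --dlfilter, --script and so on.
cpus = [0, 1, 2, 3]  # made-up CPU list; the real script derives it from the data

jobs = []
for cpu in cpus:
    out = open(f"out-cpu{cpu}.txt", "w")
    err = open(f"err-cpu{cpu}.txt", "w")
    jobs.append((subprocess.Popen(["perf", "script", "--cpu", str(cpu)],
                                  stdout=out, stderr=err), out, err))

for popen, out, err in jobs:
    popen.wait()
    out.close()
    err.close()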


>> +
>> +    def __init__(self, cmd, pipe_to, output_dir="."):
>> +        self.popen = None
>> +        self.consumer = None
>> +        self.cmd = cmd
>> +        self.pipe_to = pipe_to
>> +        self.output_dir = output_dir
>> +        self.cmdout_name = output_dir + "/cmd.txt"
>> +        self.stdout_name = output_dir + "/out.txt"
>> +        self.stderr_name = output_dir + "/err.txt"
>
> Why use files here and not pipes?

There is an option (--pipe-to) to pipe the output to another command.
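
Roughly, the difference is between the job's stdout going to the out.txt
file and, with --pipe-to, being fed to the given command instead. (This is
just a sketch of the idea, not the script's actual code; gzip is only an
example consumer.)

import subprocess

# Without --pipe-to: stdout goes to the job's out.txt file.
with open("out.txt", "w") as out, open("err.txt", "w") as err:
    subprocess.run(["perf", "script", "--cpu", "0"], stdout=out, stderr=err)

# With --pipe-to CMD: stdout is piped to CMD instead.
with open("err.txt", "w") as err, open("out.txt.gz", "wb") as gz:
    producer = subprocess.Popen(["perf", "script", "--cpu", "0"],
                                stdout=subprocess.PIPE, stderr=err)
    consumer = subprocess.Popen(["gzip"], stdin=producer.stdout, stdout=gz)
    producer.stdout.close()  # let gzip see EOF when perf script exits
    producer.wait()
    consumer.wait()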

> Could using files cause the command
> to fail on a read-only file system?

The user chooses the output directory, so they will need the foresight
not to choose a read-only file system.