Re: [RFC PATCH 1/1] kselftests: Add test to detect boot event slowdowns

From: Laura Nao
Date: Thu Aug 08 2024 - 06:45:40 EST


On 7/31/24 11:25, Laura Nao wrote:
>
> It looks like sleepgraph.py is more focused on analyzing suspend/resume
> timings, while bootgraph.py measures boot time using the kernel log and
> ftrace. The latter might indeed come in handy.
> As far as I can see, the script doesn't support automatic detection of
> boot slowdowns, and the output is in HTML format, which is meant for
> human analysis. However, I can look into adding support for a more
> machine-readable output format too. The test proposed in this patch could
> then use bootgraph.py to generate the reference file and measure current
> boot timings.
>
> I'll look into this and report back.
>

After examining the bootgraph.py script, it seems feasible to add
support for generating the output in a machine-readable format
(e.g., JSON) for automated analysis. Todd, I've CC'd you on this
discussion in case you have feedback on possibly using bootgraph.py in
an automated test to detect slowdowns.
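
To illustrate what a machine-readable output could look like, here is a
minimal sketch. It is not bootgraph.py's code: parse_initcalls() and
to_json() are hypothetical helpers, and the JSON layout is only an
assumption. It assumes the kernel was booted with initcall_debug, so
that the log contains "initcall ... returned ... after N usecs" lines.

```python
import json
import re

# Assumed log line shape when booting with initcall_debug, e.g.:
# [    0.312345] initcall foo_init+0x0/0x40 returned 0 after 120 usecs
# A module name in brackets may follow the symbol.
INITCALL_RE = re.compile(
    r"initcall (?P<name>[^\s+]+)\+\S+(?: \[[^\]]+\])? "
    r"returned (?P<ret>-?\d+) after (?P<usecs>\d+) usecs"
)


def parse_initcalls(log_text):
    """Return a dict mapping initcall name to its duration in usecs."""
    timings = {}
    for line in log_text.splitlines():
        match = INITCALL_RE.search(line)
        if match:
            timings[match.group("name")] = int(match.group("usecs"))
    return timings


def to_json(timings):
    """Serialize the timings; the top-level key is a hypothetical choice."""
    return json.dumps({"initcalls": timings}, indent=2, sort_keys=True)
```

Feeding the output of dmesg to parse_initcalls() and saving the result
of to_json() would give a file that can serve as a reference for later
runs.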

Some points to consider:

- The bootgraph.py script supports ftrace through the -fstat and -ftrace
options, and it parses the kernel log to get initcall timings. To use
this in an automated test, we need a way to provide the necessary
command line options. One approach is to include these options in a
bootconfig file embedded in the kernel image (as proposed in this
RFC). Shuah, do you think this is acceptable? I haven't seen other
tests doing this, so I'm unsure if this is a proper way to handle
required command line options in a selftest.

- The bootgraph.py script tracks timings for all initcalls, which might
be excessive and generate too much output when integrated in an
automated test. We might need to limit the test output to report only
significant slowdowns to make it manageable.

- I'd like to get some feedback on which key boot process events are
most relevant to track; depending on this, we could use the
bootgraph.py script to monitor initcalls and possibly other events
tracked via ftrace. The script currently uses the function_graph
tracer, and its parser is designed for this tracer's output. If we need
to track other events (e.g., kprobe events), the parser might need some
adjustments.
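
On limiting the output to significant slowdowns, one possible filter
could be sketched as below. find_slowdowns() is a hypothetical helper,
and both thresholds are arbitrary placeholders that would need tuning
per platform. It takes two dicts mapping initcall names to durations in
usecs (reference and current boot).

```python
def find_slowdowns(reference, current, min_ratio=1.5, min_delta_usecs=10000):
    """Return (name, ref_usecs, cur_usecs) tuples for notable slowdowns.

    An initcall is reported only if it got slower by at least
    min_delta_usecs AND by at least a factor of min_ratio, so that
    small absolute or small relative changes are both ignored.
    """
    slow = []
    for name, cur in current.items():
        ref = reference.get(name)
        if ref is None:
            # No reference timing for this initcall: nothing to compare.
            continue
        if cur - ref >= min_delta_usecs and cur >= ref * min_ratio:
            slow.append((name, ref, cur))
    # Largest absolute slowdown first.
    slow.sort(key=lambda entry: entry[2] - entry[1], reverse=True)
    return slow
```

Requiring both an absolute and a relative increase is a design choice
meant to reduce noise from very short initcalls; other criteria are
equally possible.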

I'll be discussing this at LPC in September
(https://lpc.events/event/18/contributions/1700/) and look forward to
exploring more details and alternative approaches for an automated boot
time test.

Best,

Laura