Re: [RFC PATCH 1/1] kselftests: Add test to detect boot event slowdowns

From: Laura Nao
Date: Wed Jul 31 2024 - 05:25:37 EST


Hi Shuah,

On 7/25/24 17:50, Shuah Khan wrote:
> On 7/25/24 05:06, Laura Nao wrote:
>> Introduce a new kselftest to identify slowdowns in key boot events.
>> The test uses ftrace to track timings for specific boot events.
>> The kprobe_timestamps_to_yaml.py script can be run once to generate a
>> YAML file with the initial reference timestamps for these events.
>> The test_boot_time.py script can then be run on subsequent boots to
>> compare current timings against the reference values and check for
>> significant slowdowns over time.
>> The test ships with a bootconfig file for ftrace setup and a config
>> fragment for the necessary kernel configurations.
>>
>> Signed-off-by: Laura Nao <laura.nao@xxxxxxxxxxxxx>
>
> I am repeating the same comments I made on the cover letter here as
> well.
>
> What are the dependencies if any for this new test to work?
> Please do remember that tests in default run needs to have
> minimal dependencies so they can run on systems that have
> minimal support.
>

To run this test, the kernel needs to be built with the provided config
fragment, which enables tracing and embeds the provided bootconfig file
in the kernel. Additionally, a YAML file with reference timestamps must
be supplied as input to the test.
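
To give a rough idea of what the comparison against the reference file
amounts to, here is a minimal sketch. It is not the code from the patch:
the function name, the 10% tolerance and the event names are made up for
illustration, and the reference values are shown as a plain dict rather
than loaded from YAML so the snippet has no dependencies outside the
standard library.

```python
def find_slowdowns(reference, current, tolerance=0.10):
    """Return events whose current timestamp exceeds the reference
    timestamp by more than `tolerance` (a fraction, 0.10 means 10%).

    Both arguments map an event name to a timestamp in seconds since boot.
    The name, signature and default tolerance are hypothetical.
    """
    slow = {}
    for event, ref_ts in reference.items():
        cur_ts = current.get(event)
        if cur_ts is None:
            # Event not seen on this boot; a real test would likely
            # report this separately instead of skipping it.
            continue
        if cur_ts > ref_ts * (1 + tolerance):
            slow[event] = (ref_ts, cur_ts)
    return slow


# Example values only, not measurements from a real board.
reference = {"populate_rootfs": 1.20, "kernel_init": 2.50}
current = {"populate_rootfs": 1.25, "kernel_init": 3.10}

print(find_slowdowns(reference, current))
# {'kernel_init': (2.5, 3.1)}
```

Here populate_rootfs stays within the 10% margin, while kernel_init is
flagged because 3.10 s is above 2.50 s * 1.10 = 2.75 s.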

> As mentioned earlier take a look at the tools/power/pm-graph
> bootgraph.py and sleepgraph.py to see if you can leverage
> them - bootgraph detects slowdowns during boot.
>
> We don't want to add duplicate scripts if the other one
> serves the needs. Those can be moved to selftests if it
> make sense.
>
> I will review this once we figure out if bootgraph serves
> the needs and I understand the dependencies for this test
> to work.
>

Thanks for the pointers!

It looks like sleepgraph.py is more focused on analyzing suspend/resume
timings, while bootgraph.py measures boot time using the kernel log and
ftrace. The latter might indeed come in handy.
As far as I can see, the script doesn't support automatic detection of
boot slowdowns, and the output is in HTML format, which is meant for
human analysis. However, I can look into adding support for a more
machine-readable output format too. The test proposed in this patch could
then use bootgraph.py to generate the reference file and measure current
boot timings.
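
As a rough sketch of what a machine-readable output could look like, the
snippet below turns kernel log lines into JSON. It assumes the log was
captured with initcall_debug enabled, so that it contains the
"initcall ... returned ... after ... usecs" messages. The function name
and the JSON layout are hypothetical and are not an existing bootgraph.py
interface; the sample log lines are made up.

```python
import json
import re

# Matches lines such as:
# [    0.734567] initcall populate_rootfs+0x0/0x110 returned 0 after 217012 usecs
INITCALL_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+initcall (?P<name>\S+?)\+\S+"
    r"(?: \[\S+\])? returned (?P<ret>-?\d+) after (?P<usecs>\d+) usecs"
)


def initcalls_to_json(log_text):
    """Extract initcall completion records from a kernel log as JSON.

    Hypothetical helper: each record holds the initcall name, the log
    timestamp at which it returned, and its reported duration.
    """
    events = []
    for line in log_text.splitlines():
        m = INITCALL_RE.search(line)
        if m:
            events.append({
                "name": m.group("name"),
                "end_ts": float(m.group("ts")),
                "duration_us": int(m.group("usecs")),
            })
    return json.dumps(events, indent=2)


# Made-up sample in the initcall_debug log format.
sample = """\
[    0.512345] calling  populate_rootfs+0x0/0x110 @ 1
[    0.734567] initcall populate_rootfs+0x0/0x110 returned 0 after 217012 usecs
"""

print(initcalls_to_json(sample))
```

A file in this form could serve both as the reference and as the input
for the comparison on later boots.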

I'll look into this and report back.

Thanks,

Laura