Kernel regression tracking/reporting initiatives and KCIDB

From: Ricardo Cañuelo
Date: Tue Aug 01 2023 - 07:47:44 EST


Hi all,

I'm Ricardo from Collabora. In the past months, we’ve been analyzing the
current status of CI regression reporting and tracking in the Linux
kernel: assessing the existing tools, testing their functionalities,
collecting ideas about desirable features that aren’t available yet and
sketching some of them.

As part of this effort, we wrote a Regression Tracker tool [1] as a
proof of concept. It’s a rather simple tool that takes existing
regression data and reports and uses them to show more context on each
reported regression: the relationships between regressions, whether
they could have been caused by an infrastructure error, and other
metadata about their current status. We’ve been using it mostly as a
playground to explore what CI systems currently provide and to test
ideas about new features.

We’re also checking other tools and services provided by the community,
such as regzbot [2], collaborating with them when possible and thinking
about how to combine multiple scattered efforts by different people
towards the same common goal. As a first step, we’ve contributed to
regzbot and partially integrated its results into the Regression Tracker
tool.

So far, we’ve been using the KernelCI regression data and reports as a
data source. We're now wondering if we could tackle the problem with a
more general approach by building on top of what KCIDB already provides.

In general, CI systems tend to define regressions as a low-level concept
which is rather static: a snapshot of a test result at a certain point
in time. When it comes to reporting them to developers, there's much
more information that could be added: in particular, the context of
each regression and the fact that a reported regression has a life
cycle. For example:

- did this test also fail on other hardware targets or with other kernel
configurations?
- is it possible that the test failed because of an infrastructure
error?
- does the test fail consistently since that commit or does it show
unstable results?
- does the test output show any traces of already known bugs?
- has this regression been bisected and reported anywhere?
- was the regression reported by anyone? If so, is there someone already
working on it?

Many of these data points can be extracted from the CI results databases
and processed to provide additional regression data. That’s what we’re
trying to do with the Regression Tracker tool, and we think it’d be
interesting to start experimenting with the data in KCIDB to see how
this could be improved and what would be the right way to integrate this
type of functionality.
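
As a rough illustration of two of the questions above (does the test
also fail on other targets or configurations, and does it fail
consistently since the first bad commit), here is a minimal Python
sketch. It is not how the Regression Tracker or KCIDB work: the
ResultRecord fields and the summarize_regression() helper are
hypothetical names made up for this example, and a real implementation
would have to map them to whatever the results database actually stores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResultRecord:
    """One test result. All field names are hypothetical, not a real schema."""
    test: str
    platform: str
    config: str
    commit_index: int  # position of the tested commit in the branch history
    passed: bool


def summarize_regression(records, test, platform, config, first_bad):
    """Add context to a regression first seen at commit index `first_bad`."""
    # Only results for this test at or after the first failing commit matter.
    since = [r for r in records
             if r.test == test and r.commit_index >= first_bad]

    # Results for the same platform/config pair as the reported regression.
    here = [r for r in since
            if (r.platform, r.config) == (platform, config)]

    # Other platform/config pairs where the same test also failed.
    elsewhere = sorted({(r.platform, r.config) for r in since
                        if not r.passed
                        and (r.platform, r.config) != (platform, config)})

    if not here:
        stability = "unknown"
    elif all(not r.passed for r in here):
        stability = "consistent"  # failed every time since first_bad
    else:
        stability = "unstable"    # mixed pass/fail results since first_bad

    return {"stability": stability, "also_fails_on": elsewhere}
```

Called with the results for a single test and the commit where it first
failed, this returns whether the failure looks consistent or unstable
and the list of other platform/config pairs showing the same failure.
The remaining questions (infrastructure errors, known bug traces,
bisection and report status) would need extra data sources on top of
this.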

Please let us know if that's a possibility and if you'd like to add
anything to the ideas proposed above.

Cheers,
Ricardo

[1] https://kernel.pages.collabora.com/kernelci-regressions-tracker/
[2] https://linux-regtracking.leemhuis.info/regzbot/all/