Re: AUTOSEL process

From: Eric Biggers
Date: Sat Mar 11 2023 - 12:48:24 EST


On Sat, Mar 11, 2023 at 11:16:44AM -0500, Theodore Ts'o wrote:
> On Sat, Mar 11, 2023 at 09:06:08AM -0500, Sasha Levin wrote:
> >
> > I suppose that if I had a way to know if a certain commit is part of a
> > series, I could either take all of it or none of it, but I don't think I
> > have a way of doing that by looking at a commit in Linus' tree
> > (suggestions welcome, I'm happy to implement them).
>
> Well, this is why I think it is a good idea to have a link to the
> patch series in lore. I know Linus doesn't like it, claiming it
> doesn't add any value, but I have to disagree. It adds two bits of
> value.
>

So, earlier I was going to go into more detail about some of my ideas, before
Sasha and Greg started stonewalling with "patches welcome" (i.e. "I'm refusing
to do my job") and various silly arguments about why nothing should be changed.
But I suppose the worst thing that can happen is that that just continues, so
here it goes:

One of the first things I would do if I was maintaining the stable kernels is to
set up a way to automatically run searches on the mailing lists, and then take
advantage of that in the stable process in various ways. Not having that is the
root cause of a lot of the issues with the current process, IMO.

Now that lore exists, this might be trivial: it could be done just by hammering
lore.kernel.org with queries of the form
https://lore.kernel.org/linux-fsdevel/?q=query from a Python script.
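
A minimal sketch of what such a script might look like, using only the Python
standard library. It assumes that lore's search accepts an "x=A" parameter to
return results as an Atom feed and an "s:" prefix to match on the subject; both
should be checked against the lore help page. The function names and the
User-Agent string are made up for illustration.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

LORE = "https://lore.kernel.org"
ATOM = "{http://www.w3.org/2005/Atom}"


def search_url(mailing_list, query):
    """Build a search URL for one list (x=A is assumed to request Atom)."""
    params = urllib.parse.urlencode({"q": query, "x": "A"})
    return f"{LORE}/{mailing_list}/?{params}"


def parse_atom(xml_data):
    """Return (title, link) pairs for each entry in an Atom feed."""
    root = ET.fromstring(xml_data)
    results = []
    for entry in root.iter(ATOM + "entry"):
        title = entry.findtext(ATOM + "title", default="")
        link = entry.find(ATOM + "link")
        href = link.get("href") if link is not None else None
        results.append((title, href))
    return results


def search(mailing_list, query, timeout=30):
    """Run one query against lore. This hits the network."""
    req = urllib.request.Request(
        search_url(mailing_list, query),
        headers={"User-Agent": "stable-triage-sketch/0.1"},  # hypothetical
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return parse_atom(resp.read())
```

For example, search("linux-fsdevel", 's:"ext4: fix foo"') would look for
messages on linux-fsdevel whose subject contains that phrase.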

Of course, there's a chance that won't scale to multiple queries for each one of
thousands of stable commits, or at least won't be friendly to the kernel.org
admins. In that case, what can be done is to download all emails from all
lists, using lore's git mirrors or Atom feeds, and index them locally. (Note:
if the complete history is inconveniently large, then just indexing the last
year or so would work nearly as well.)
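
As a rough sketch of the local-index idea, assuming the raw RFC 822 text of
each message has already been obtained somehow (that part is left out here),
something like the following SQLite table would be enough to search by subject
and follow reply chains. The schema and function names are illustrative only.

```python
import sqlite3
from datetime import timezone
from email import message_from_string
from email.utils import parsedate_to_datetime

SCHEMA = """
CREATE TABLE IF NOT EXISTS messages (
    message_id  TEXT PRIMARY KEY,
    in_reply_to TEXT,
    subject     TEXT,
    date_utc    TEXT,
    body        TEXT
)
"""


def open_index(path=":memory:"):
    """Open (or create) the index database."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def index_message(conn, raw):
    """Parse one raw email and store its headers and plain body."""
    msg = message_from_string(raw)
    date = parsedate_to_datetime(msg["Date"]).astimezone(timezone.utc)
    # Multipart bodies are skipped to keep the sketch short.
    body = "" if msg.is_multipart() else msg.get_payload()
    conn.execute(
        "INSERT OR IGNORE INTO messages VALUES (?, ?, ?, ?, ?)",
        (msg["Message-ID"], msg["In-Reply-To"], msg["Subject"],
         date.isoformat(), body),
    )


def search_subject(conn, text):
    """Return (message_id, subject) rows whose subject contains text.

    Note that '%' and '_' in text act as LIKE wildcards.
    """
    rows = conn.execute(
        "SELECT message_id, subject FROM messages "
        "WHERE subject LIKE ? ORDER BY date_utc",
        (f"%{text}%",),
    )
    return rows.fetchall()
```

Indexing the same Message-ID twice is a no-op, so the script can be re-run over
overlapping batches of mail.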

Then once that is in place, that could be used in various ways. For example,
given a git commit, it's possible to search by email subject to get to the
original patch, *even if the git commit does not have a Link tag*. And it can
be automatically checked whether it's part of a patch series, and if so, whether
all the patches in the series are being backported or just some.
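
To illustrate the series check, here is a sketch that assumes the subjects of
all emails in the original thread have already been found by a search. It
parses the usual "[PATCH vN i/M]" prefix and reports which patches of the
series are not among the commits being backported. The helper names are
hypothetical, and real subjects will have more variations than this handles.

```python
import re

# Matches a leading "[... PATCH ...]" tag and captures the rest as the title.
_TAG_RE = re.compile(r"^\s*\[([^\]]*\bPATCH\b[^\]]*)\]\s*(.*)$", re.IGNORECASE)


def parse_patch_subject(subject):
    """Split '[PATCH v2 3/7] title' into its parts, or return None."""
    m = _TAG_RE.match(subject)
    if not m:
        return None
    tags, title = m.group(1), m.group(2).strip()
    pos = re.search(r"\b(\d+)/(\d+)\b", tags)
    ver = re.search(r"\bv(\d+)\b", tags, re.IGNORECASE)
    return {
        "title": title,
        "version": int(ver.group(1)) if ver else 1,
        "index": int(pos.group(1)) if pos else 1,
        "total": int(pos.group(2)) if pos else 1,
    }


def _norm(title):
    """Normalize whitespace and case so titles compare reliably."""
    return " ".join(title.split()).lower()


def missing_from_backport(series_subjects, backported_titles):
    """Return titles of series patches that are not being backported."""
    have = {_norm(t) for t in backported_titles}
    missing = []
    for subject in series_subjects:
        patch = parse_patch_subject(subject)
        if patch is None or patch["index"] == 0:
            continue  # not a patch email, or the cover letter
        if _norm(patch["title"]) not in have:
            missing.append(patch["title"])
    return missing
```

An empty result means the whole series is being taken; a non-empty one is
something to flag for a human to look at.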

This could also be used to check for mentions of a commit on the mailing list
that potentially indicate a regression report, which is one of the issues we
discussed earlier. I'm not sure what the optimal search criteria would be, but
one option would be something like "messages that contain the commit title or
commit ID and are dated after the commit was committed". There might need
to be some exclusions added to that.
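
A sketch of that criterion, operating on messages that have already been
fetched and reduced to plain dicts with "date", "subject" and "body" keys (a
made-up representation). The 12-character abbreviation length and the
"exclude" hook are assumptions, not a recommendation of what the exclusions
should be.

```python
from datetime import datetime, timezone


def mentions_commit(text, title, commit_id, min_abbrev=12):
    """True if text contains the commit title or an abbreviated commit ID."""
    lowered = text.lower()
    return (title.lower() in lowered
            or commit_id[:min_abbrev].lower() in lowered)


def possible_regression_reports(messages, title, commit_id, committed_at,
                                exclude=lambda msg: False):
    """Return messages dated after the commit that mention it.

    exclude is a caller-supplied predicate for messages to skip.
    """
    hits = []
    for msg in messages:
        if msg["date"] <= committed_at:
            continue
        if exclude(msg):
            continue
        text = msg["subject"] + "\n" + msg["body"]
        if mentions_commit(text, title, commit_id):
            hits.append(msg)
    return hits
```

All dates need to be timezone-aware (or all naive) for the comparison to work.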

This could also be used to automatically find the AUTOSEL email, if one exists,
and check whether it's been replied to or not.
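
Roughly, and assuming AUTOSEL emails keep a "[PATCH AUTOSEL ...]" subject
prefix followed by the commit title, that check could look like the sketch
below. Messages are again plain dicts (with "subject", "message_id" and
"in_reply_to" keys), which is a made-up representation, and only direct replies
are counted.

```python
import re

# Assumed subject shape: "[PATCH AUTOSEL 6.1 03/28] commit title"
AUTOSEL_RE = re.compile(r"^\[PATCH AUTOSEL\b[^\]]*\]\s*(.*)$")


def find_autosel(messages, commit_title):
    """Return the AUTOSEL emails whose title matches the commit title."""
    wanted = commit_title.strip().lower()
    found = []
    for msg in messages:
        match = AUTOSEL_RE.match(msg["subject"])
        if match and match.group(1).strip().lower() == wanted:
            found.append(msg)
    return found


def replies_to(messages, parent):
    """Return messages whose In-Reply-To points at parent."""
    return [m for m in messages
            if m.get("in_reply_to") == parent["message_id"]]
```

An AUTOSEL email with a non-empty replies_to() result is one where someone may
have objected, so it should not be silently queued.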

The purpose of all these mailing list searches would be to generate a list of
potential issues with backporting each commit, which would then undergo brief
human review. Once issues are reviewed, that state would be persisted, so that
if the script gets run again, it would only show *new* information based on new
mailing list emails that have not already been reviewed. That's needed because
these issues need to be checked for when the patch is initially proposed for
stable as well as slightly later, before the actual release happens.
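
The persisted state does not need to be anything fancy. As a sketch, a JSON
file mapping each commit ID to the Message-IDs that a human has already looked
at would do; the file layout and function names here are purely illustrative.

```python
import json
import os


def load_state(path):
    """Load the reviewed-message state, or start empty if there is none."""
    if not os.path.exists(path):
        return {}
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def save_state(path, state):
    """Write the state to a temporary file, then rename it into place."""
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as f:
        json.dump(state, f, indent=2, sort_keys=True)
    os.replace(tmp, path)


def unreviewed(state, commit_id, message_ids):
    """Return the message IDs not yet reviewed for this commit."""
    seen = set(state.get(commit_id, []))
    return [mid for mid in message_ids if mid not in seen]


def mark_reviewed(state, commit_id, message_ids):
    """Record that these messages have been reviewed for this commit."""
    seen = set(state.get(commit_id, []))
    seen.update(message_ids)
    state[commit_id] = sorted(seen)
```

On each run the script would call unreviewed() with the current search hits,
show only those, and call mark_reviewed() followed by save_state() once they
have been dealt with.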

If the stable maintainers have no time for doing *any* human review themselves
(again, I do not know what their requirements are on how much time they can
spend per patch), then instead an email with the list of potential issues could
be generated and sent to stable@xxxxxxxxxxxxxxx for review by others.

Anyway, that's my idea. I know the response will be either "that won't work" or
"patches welcome", or a mix of both, but that's it.

- Eric