Re: Introduce Sashiko (agentic review of Linux kernel changes)

From: Roman Gushchin

Date: Thu Apr 02 2026 - 21:49:06 EST

Sean Christopherson <seanjc@xxxxxxxxxx> writes:

> +Venkatesh and Paolo
>
> On Thu, Mar 19, 2026, Roman Gushchin wrote:
>> "Lorenzo Stoakes (Oracle)" <ljs@xxxxxxxxxx> writes:
>> > On Wed, Mar 18, 2026 at 11:33:22AM -0700, Roman Gushchin wrote:
>> >> >> Finally, some subsystems have good prompt coverage and some don't. It
>> >> >> doesn't have to be lengthy documentation (and it might actually be
>> >> >> counter-productive), but having a small list of things to look at - some
>> >> >> high-level concepts which are hard to grasp from the code, etc. - can
>> >> >> help a lot with both bug discovery and false positives.
>> >> >
>> >> > I guess best contributed to Chris's review-prompts repo right?
>> >>
>> >> Both work for me now; we'll figure out with Chris how to sync our
>> >> prompts. The small problem is that we're using various models, tools and
>> >> review protocols and can barely test each other's setup. And it's all
>> >> very fragile, so it's not exactly trivial.
>> >> But we'll figure out something soon.
>> > Yeah, part of the fun I guess :)
>> >
>> >> In general we need to carefully separate instructions (like which tools
>> >> to use, which prompts to load etc) from factual data. Then we can easily
>> >> use the factual data with various tooling around.
>
> In an offline conversation, Venkatesh had a very (IMO) insightful observation
> regarding the factual data of the prompts: the information is also very useful
> documentation for *humans*. And in response to me lamenting about having to
> potentially review an external repo, Venkatesh also suggested putting the gory
> details about subsystem behavior in the kernel's Documentation/.
>
> To me, that suggestion seems like a no-brainer. The existing subject matter
> experts are already in place to review and help maintain the documentation, the
> documentation can be updated in lockstep with the code, those of us that like
> email-based review don't need to change our ways, etc. :-)
>
> And irrespective of AI domination, I'd love to have detailed documentation of some
> of KVM's gnarlier internals. If AI review is what gets us the staffing/motivation
> to write and maintain that documentation, then so be it. It would be a shame if
> some of the most comprehensive documentation for the kernel is buried in AI
> specific prompts.
>
> Naively, synchronizing from Documentation to model-specific bots doesn't seem
> like it'd be a hard problem to solve.

I think so too, thanks Sean!

First, I agree improving the documentation is a no-brainer with or
without AI. And AI will benefit from it too.

The only part which is slightly less obvious is what should go into
prompts vs what the model gets through training. I'm pretty sure that
the Linux kernel source code is used during training, so at least the
big frontier models already "know" a lot about the kernel code. Prompts
should really only close the gap between the training cutoff date and
new developments, plus cover all kinds of tribal knowledge which is not
easy to derive from the code or documentation (e.g. the outcome of some
hallway conversation during the last conference).

But I'm hopeful we can figure out a way to auto-generate prompts from
the documentation by stripping the parts which are "obvious" to the model.
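
Very roughly, and only as a sketch of the idea rather than anything we
have today: every name below is made up for illustration, the
"last_changed" date is assumed to come from something like git history,
and the "is_obvious" check is a placeholder for whatever test ends up
deciding that the model already knows a section.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Iterable


@dataclass
class DocSection:
    """One section of documentation (hypothetical structure)."""
    title: str
    body: str
    last_changed: date  # assumption: e.g. derived from git history


def build_prompt(
    sections: Iterable[DocSection],
    cutoff: date,
    is_obvious: Callable[[DocSection], bool],
) -> str:
    """Build a prompt from the sections worth sending to the model.

    A section is kept if it changed after the model's training cutoff,
    or if the placeholder is_obvious() check says the model is unlikely
    to know it already.
    """
    kept = [
        s for s in sections
        if s.last_changed > cutoff or not is_obvious(s)
    ]
    return "\n\n".join(f"{s.title}\n{s.body}" for s in kept)
```

The hard part is obviously the "is_obvious" callback, which this sketch
leaves entirely open.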

Thanks!