Re: Introduce Sashiko (agentic review of Linux kernel changes)

From: Chris Mason

Date: Fri Apr 03 2026 - 08:36:41 EST


On Fri, Apr 3, 2026 at 8:23 AM Lorenzo Stoakes (Oracle) <ljs@xxxxxxxxxx> wrote:
>
> On Fri, Apr 03, 2026 at 08:11:30AM -0400, Theodore Tso wrote:
> > One other thing to consider is copyright. This issue is one we can
> > safely ignore when we are asking LLMs to review code. But if we ask
> > LLMs to generate documentation, and then we cut and paste the
> > generated text into kernel documentation, the copyright status of the
> > generated text is not well defined.
> >
> > In Europe, the European Commission has promulgated that LLM output,
> > having been generated by a machine, and not a human being, is not
> > copyrighted. If a human being then makes changes, the combined work
> > could be subject to copyright, and if it is merged into code that is
> > subject to the GPL (for example), the combined work would also be
> > subject to the original license. But that's only in Europe.
> >
> > But consider that researchers were able to extract 96% of Harry Potter and
> > the Sorcerer's Stone from Claude 3.7 Sonnet. So with the right
> > prompt, if we get a paragraph that came from some published book about
> > Linux, and it was dropped into the Documentation/ directory, that
> > might be problematic, since even (or maybe especially) the European
> > Union might want to take a hard line. (Do you hear the people sing,
> > singing the songs of angry Victor Hugo's? :-)
> >
> > If we use an LLM to analyze documentation to identify gaps, and we
> > take a bullet list of missing functions or semantics, and the human
> > being writes new text from scratch, instead of cutting and pasting
> > directly from the LLM, that should be safe. But of course, I'm not a
> > lawyer and I don't play one on TV.
>
> I don't think anybody's suggesting we use LLMs to generate documentation,
> at least that's not how I interpreted it?
>
> I'm very much against that, it absolutely requires expert input, and I've
> already personally rejected AI slop mm documentation submitted fairly
> recently.
>

I agree we need to very closely review any LLM-generated content, but
the subsystem guides in the review prompts are mostly AI-generated. I
personally would enjoy them a lot more if they also contained Harry
Potter excerpts, but we're not quite there yet.

Ex: https://github.com/masoncl/review-prompts/blob/main/kernel/subsystem/mm-vma.md

I'm sure as these get reviewed we'll find bugs, inaccuracies, and the
need to restructure, but it's not so wildly wrong as to be useless
either.

-chris