Re: LLM based rewrites
From: H. Peter Anvin
Date: Mon Mar 09 2026 - 13:01:30 EST
On March 9, 2026 9:33:12 AM PDT, Jonathan Corbet <corbet@xxxxxxx> wrote:
>Steven Rostedt <rostedt@xxxxxxxxxxx> writes:
>
>> On Mon, 09 Mar 2026 08:31:03 -0700
>> "H. Peter Anvin" <hpa@xxxxxxxxx> wrote:
>>
>>> It is somewhat hard to see how that would constitute a "clean-room"
>>> rewrite. A clean-room rewrite entails two teams, one (the "clean" room)
>>> which must be certified to have never seen the code in question, and all
>>> communications between the two teams must be auditable.
>>
>> I was thinking the same.
>
>The argumentation that is being made (which I am trying to reproduce but
>am *not* advocating) is that "a clean-room rewrite is just one means to
>an end" and that, in this specific case, the code being rewritten was
>explicitly excluded from the context given to the bot (though that turns
>out not to entirely be the case). In theory, it only had the desired
>API and a set of tests available to it.
>
>The fact that every version of chardet was surely in its training data
>is not deemed to be relevant.
>
>jon
>
That's a question for the lawyers and the courts, really. But it is most definitely *not* clean room. That being said, clean room is certainly not the only way to rewrite software in a manner that can pass legal muster, but it is the gold standard.