Re: [PATCH v2] checkpatch: improve email parsing

From: Lukas Bulwahn
Date: Wed Nov 04 2020 - 03:05:36 EST




On Tue, 3 Nov 2020, Joe Perches wrote:

> On Tue, 2020-11-03 at 09:10 +0100, Lukas Bulwahn wrote:
> > Maybe you can coordinate among each other who would want to create
> > suitable fix rules here?
>
> Yes please.
>
> > Also, start with the class of the most frequent mistakes for
> > unexpected content after email addresses.
> >
> > I imagine that a maintainer can simply run a tag sanitizing script
> > which just cleans up those stupid mistakes before creating their git
> > trees or sending git pulls to Linus.
>
> Does anyone really do that?
> It generally requires rebasing or post processing each commit after
> being committed before another commit occurs.
>

As far as I know from private converations, some maintainers do have
early testing branches, rebase those, squash commits and hence, they
might also sanitize the 'commit messages' if it works reliably; but I am
not a maintainer. I guess we implement something useful and then ask some
early-adopting maintainers to give it a spin and get some feedback.

> > Let us try to add these
> > sanitizing rules to checkpatch.pl with fix options for now; if that
> > sanitizing feature becomes a monster script of its own within
> > checkpatch.pl, we can refactor that into an independent script for
> > cleaning up.
>
> I rather doubt an independent script is going to be worthwhile
> as these rules shouldn't be all that complex.
>

Good to know. Okay, so let us add the rules and corresponding fix options
to checkpatch.pl.

> The only prefixes acceptable for a stable address should be
> CC:|Cc:|cc:. There are 2 uses in the last 100k commits for
> Signed-off-by: and Acked-by: with stable addresses, those should have a
> message/warning emitted in the future.
>
> The forms used with those cc: stable addresses:
>
> 2777 stable without comment
> 1381 stable # comment
> 74 stable [ comment ]
>
> So I suggest standardizing on no comment and # comment with any other
> style getting a warning.
>
> For non-stable <foo>-by: and cc: addresses and other signatures:
>
> Likely any content after a email address other than a parenthesized
> block should have some checkpatch message emitted.
>
> This should be OK:
>
> Signed-off-by: Full Name (comment) <address@xxxxxxxxxx> (maintainer:...)
>
> But perhaps this should not be OK:
>
> Signed-off-by: Full Name (comment) <address@xxxxxxxxxx> # comment
>
> There are 316 uses of this # comment style in the last 100k commits
> and 103 with (comment) after the address.
> Maybe the # use should be ok, maybe not.
>
> And anyone that uses a multiple comments in a name or a even
> a single comment in the email address should also get warned.
>
> The below should not be OK even if actually valid address forms:
>
> Signed-off-by: Full (comment1) Name (comment2) <address@xxxxxxxxxx>
> Signed-off-by: Full Name <address@(comment)domain.tld>
>
>

Thanks for your evaluation and hints.
I agree with them as well. Let us try to establish one common way from
comments.

Dwaipayan, if you code this into checkpatch.pl, maybe you can also add
some hints on conventions for tags in the kernel (process) documentation
to explain the rules and conventions we think make sense.

Lukas