Re: [PATCH v2] checkpatch: improve email parsing

From: Dwaipayan Ray
Date: Tue Nov 03 2020 - 03:03:14 EST


On Tue, Nov 3, 2020 at 12:58 PM Joe Perches <joe@xxxxxxxxxxx> wrote:
>
> On Tue, 2020-11-03 at 11:28 +0530, Dwaipayan Ray wrote:
> > On Tue, Nov 3, 2020 at 11:18 AM Dwaipayan Ray <dwaipayanray1@xxxxxxxxx> wrote:
> > >
> > > checkpatch doesn't report warnings for many common mistakes
> > > in emails. Some of which are trailing commas and incorrect
> > > use of email comments.
> > >
> > > At the same time several false positives are reported due to
> > > incorrect handling of mail comments. The most common of which
> > > is due to the pattern:
> > >
> > > <stable@xxxxxxxxxxxxxxx> # X.X
> > >
> > > Improve email parsing mechanism in checkpatch.
> > >
> > > What is added:
> > >
> > > - Support for multiple name/address comments.
> > > - Improved handling of quoted names.
> > > - Sanitize improperly formatted comments.
> > > - Sanitize trailing semicolon or dot after email.
> []
> > What do you think? Should warnings for the names which should
> > be quoted be reported considering this result?
>
> Clearly the quote suggestion is unnecessary.
>
> I think that "cc: stable@(?:vger\.)?kernel\.org" should be
> treated differently from other forms of invalid/odd address lines.
>
> My suggestion is that the case insensitive form of
>
> Cc: stable@xxxxxxxxxxxxxxx
>
> or only another similar case insensitive forms with a
> # comment separator like
>
> Cc: <stable@xxxxxxxxxxxxxxx> # some comment
>
> be acceptable for stable.
>
> All other forms with stable@ should emit some message.
>
> And other <foo>-by: and cc: addresses should only have a form like
>
> Signed-off-by: "Full.Name" (possible comment) <email@xxxxxxxxxx>
> or
> Signed-off-by: Full Name (possible comment) <email@xxxxxxxxxx>
>
> etc..
>
> and any additional content after .tld in the email address be flagged
> with some message like "unexpected content after email address" rather
> than "might be better as".
>
> What do you think best?
>

Hello,
I think the warning "unexpected content after email" is more reasonable.
But maybe we can allow addresses of type:

Full Name <email@xxxxxxxxxx> (comment here)
or
Full Name <email@xxxxxxxxxx> [another comment]

I checked the following:
$ git log --format=email -100000 | grep -P '^(?:[\w\-]+-by:|cc:|CC:|Cc:)' \
| grep -P ".*<\S+\@\S+>\s*(?:\(|\[|\/)" | wc -l
280

And for stable ones, I found another pattern which is common:
Cc: <stable@xxxxxxxxxxxxxxx> [X.X]
and only few instances of:
cc: stable@xxxxxxxxxxxxxxx (X.X)

Apart from these no other style has been used for stable so far.
Maybe placing it in the generic check would suffice.

So I think we can introduce the unexpected content warning for
anything other than the braces or c89 comment styles. And
maybe a separate check for stable won't be necessary then.

What is your thought on this?

Thanks,
Dwaipayan.