Re: [PATCH v5] checkpatch: add new warnings to author signoff checks.

From: Dwaipayan Ray
Date: Wed Oct 07 2020 - 02:39:07 EST


On Wed, Oct 7, 2020 at 12:03 PM Dwaipayan Ray <dwaipayanray1@xxxxxxxxx> wrote:
>
> The author signed-off-by checks are currently very vague.
> Cases like same name or same address are not handled separately.
>
> For example, running checkpatch on commit be6577af0cef
> ("parisc: Add atomic64_set_release() define to avoid CPU soft lockups"),
> gives:
>
> WARNING: Missing Signed-off-by: line by nominal patch author
> 'John David Anglin <dave.anglin@xxxxxxxx>'
>
> The signoff line was:
> "Signed-off-by: Dave Anglin <dave.anglin@xxxxxxxx>"
>
> Clearly the author has signed off but with a slightly different version
> of his name. A more appropriate warning would have been to point out
> at the name mismatch instead.
>
> Previously, the values assumed by $authorsignoff were either 0 or 1
> to indicate whether a proper sign off by author is present.
> Extended the checks to handle four new cases.
>
> $authorsignoff values now denote the following:
>
> 0: Missing sign off by patch author.
>
> 1: Sign off present and identical.
>
> 2: Addresses and names match, but comments differ.
> "James Watson(JW) <james@xxxxxxxxx>", "James Watson <james@xxxxxxxxx>"
>
> 3: Addresses match, but names are different.
> "James Watson <james@xxxxxxxxx>", "James <james@xxxxxxxxx>"
>
> 4: Names match, but addresses are different.
> "James Watson <james@xxxxxxxxxx>", "James Watson <james@xxxxxxxxx>"
>
> 5: Names match, addresses excluding subaddress details (RFC 5233) match.
> "James Watson <james@xxxxxxxxx>", "James Watson <james+a@xxxxxxxxx>"
>
> Suggested-by: Joe Perches <joe@xxxxxxxxxxx>
> Signed-off-by: Dwaipayan Ray <dwaipayanray1@xxxxxxxxx>
> ---
> scripts/checkpatch.pl | 93 +++++++++++++++++++++++++++++++++++--------
> 1 file changed, 77 insertions(+), 16 deletions(-)
>
> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> index 31624bbb342e..6e0a0d4603d0 100755
> --- a/scripts/checkpatch.pl
> +++ b/scripts/checkpatch.pl
> @@ -1163,10 +1163,10 @@ sub parse_email {
> }
> }
>
> + $comment = trim($comment);
> $name = trim($name);
> $name =~ s/^\"|\"$//g;
> - $name =~ s/(\s*\([^\)]+\))\s*//;
> - if (defined($1)) {
> + if ($name =~ s/(\s*\([^\)]+\))\s*//) {
> $name_comment = trim($1);
> }
> $address = trim($address);
> @@ -1181,10 +1181,12 @@ sub parse_email {
> }
>
> sub format_email {
> - my ($name, $address) = @_;
> + my ($name, $name_comment, $address, $comment) = @_;
>
> my $formatted_email;
>
> + $name_comment = trim($name_comment);
> + $comment = trim($comment);
> $name = trim($name);
> $name =~ s/^\"|\"$//g;
> $address = trim($address);
> @@ -1197,9 +1199,9 @@ sub format_email {
> if ("$name" eq "") {
> $formatted_email = "$address";
> } else {
> - $formatted_email = "$name <$address>";
> + $formatted_email = "$name$name_comment <$address>";
> }
> -
> + $formatted_email .= "$comment";
> return $formatted_email;
> }
>
> @@ -1207,17 +1209,23 @@ sub reformat_email {
> my ($email) = @_;
>
> my ($email_name, $name_comment, $email_address, $comment) = parse_email($email);
> - return format_email($email_name, $email_address);
> + return format_email($email_name, $name_comment, $email_address, $comment);
> }
>
> sub same_email_addresses {
> - my ($email1, $email2) = @_;
> + my ($email1, $email2, $match_comment) = @_;
>
> my ($email1_name, $name1_comment, $email1_address, $comment1) = parse_email($email1);
> my ($email2_name, $name2_comment, $email2_address, $comment2) = parse_email($email2);
>
> + if ($match_comment != 1) {
> + return $email1_name eq $email2_name &&
> + $email1_address eq $email2_address;
> + }
> return $email1_name eq $email2_name &&
> - $email1_address eq $email2_address;
> + $email1_address eq $email2_address &&
> + $name1_comment eq $name2_comment &&
> + $comment1 eq $comment2;
> }
>
> sub which {
> @@ -2347,6 +2355,7 @@ sub process {
> my $signoff = 0;
> my $author = '';
> my $authorsignoff = 0;
> + my $author_sob = '';
> my $is_patch = 0;
> my $is_binding_patch = -1;
> my $in_header_lines = $file ? 0 : 1;
> @@ -2674,9 +2683,37 @@ sub process {
> if ($line =~ /^\s*signed-off-by:\s*(.*)/i) {
> $signoff++;
> $in_commit_log = 0;
> - if ($author ne '') {
> - if (same_email_addresses($1, $author)) {
> + if ($author ne '' && $authorsignoff != 1) {
> + if (same_email_addresses($1, $author, 1)) {
> $authorsignoff = 1;
> + } else {
> + my $ctx = $1;
> + my ($email_name, $email_comment, $email_address, $comment1) = parse_email($ctx);
> + my ($author_name, $author_comment, $author_address, $comment2) = parse_email($author);
> +
> + if ($email_address eq $author_address && $email_name eq $author_name) {
> + $author_sob = $ctx;
> + $authorsignoff = 2;
> + } elsif ($email_address eq $author_address) {
> + $author_sob = $ctx;
> + $authorsignoff = 3;
> + } elsif ($email_name eq $author_name) {
> + $author_sob = $ctx;
> + $authorsignoff = 4;
> +
> + my $address1 = $email_address;
> + my $address2 = $author_address;
> +
> + if ($address1 =~ /(\S+)\+\S+(\@.*)/) {
> + $address1 = "$1$2";
> + }
> + if ($address2 =~ /(\S+)\+\S+(\@.*)/) {
> + $address2 = "$1$2";
> + }
> + if ($address1 eq $address2) {
> + $authorsignoff = 5;
> + }
> + }
> }
> }
> }
> @@ -2733,7 +2770,7 @@ sub process {
> }
>
> my ($email_name, $name_comment, $email_address, $comment) = parse_email($email);
> - my $suggested_email = format_email(($email_name, $email_address));
> + my $suggested_email = format_email(($email_name, $name_comment, $email_address, $comment));
> if ($suggested_email eq "") {
> ERROR("BAD_SIGN_OFF",
> "Unrecognized email address: '$email'\n" . $herecurr);
> @@ -2743,9 +2780,9 @@ sub process {
> $dequoted =~ s/" </ </;
> # Don't force email to have quotes
> # Allow just an angle bracketed address
> - if (!same_email_addresses($email, $suggested_email)) {
> + if (!same_email_addresses($email, $suggested_email, 0)) {
> WARN("BAD_SIGN_OFF",
> - "email address '$email' might be better as '$suggested_email$comment'\n" . $herecurr);
> + "email address '$email' might be better as '$suggested_email'\n" . $herecurr);
> }
> }
>
> @@ -6891,9 +6928,33 @@ sub process {
> if ($signoff == 0) {
> ERROR("MISSING_SIGN_OFF",
> "Missing Signed-off-by: line(s)\n");
> - } elsif (!$authorsignoff) {
> - WARN("NO_AUTHOR_SIGN_OFF",
> - "Missing Signed-off-by: line by nominal patch author '$author'\n");
> + } elsif ($authorsignoff != 1) {
> + # authorsignoff values:
> + # 0 -> missing sign off
> + # 1 -> sign off identical
> + # 2 -> names and addresses match, comments mismatch
> + # 3 -> addresses match, names different
> + # 4 -> names match, addresses different
> + # 5 -> names match, addresses excluding subaddress details (refer RFC 5233) match
> +
> + my $sob_msg = "'From: $author' != 'Signed-off-by: $author_sob'";
> +
> + if ($authorsignoff == 0) {
> + ERROR("NO_AUTHOR_SIGN_OFF",
> + "Missing Signed-off-by: line by nominal patch author '$author'\n");
> + } elsif ($authorsignoff == 2) {
> + CHK("NO_AUTHOR_SIGN_OFF",
> + "From:/Signed-off-by: email comments mismatch: $sob_msg\n");
> + } elsif ($authorsignoff == 3) {
> + WARN("NO_AUTHOR_SIGN_OFF",
> + "From:/Signed-off-by: email name mismatch: $sob_msg\n");
> + } elsif ($authorsignoff == 4) {
> + WARN("NO_AUTHOR_SIGN_OFF",
> + "From:/Signed-off-by: email address mismatch: $sob_msg\n");
> + } elsif ($authorsignoff == 5) {
> + WARN("NO_AUTHOR_SIGN_OFF",
> + "From:/Signed-off-by: email subaddress mismatch: $sob_msg\n");
> + }
> }
> }
>
> --
> 2.27.0
>

Hi,
I needed to change a bit in a few subroutines which trimmed off
the comments. Comments now appear in the formatted email.
The usage is also refactored so that it doesn't cause any undesired
effects.

Coming back to the signoff check, now email comments are also
compared.

Cases like:
"Dwaipayan Ray(Dwai) <dwaipayanray1@xxxxxxxxx>" and
"Dwaipayan Ray <dwaipayanray1@xxxxxxxxx>", now generate
a --strict CHK.

Thanks,
Dwaipayan.