[PATCH RFC] checkpatch: add new cases to commit handling

From: Dwaipayan Ray
Date: Fri Nov 13 2020 - 07:31:37 EST


checkpatch fails to extract the commit reference in some cases.
One of the most common causes of false positives is a line break
between the word "commit" and the git SHA that follows it.

Improve commit handling to reduce these false positives.

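Improvements:
- handle a line break between "commit" and the git SHA.
- when a commit description is split across two lines, join the
  halves with a space only if the first line ends in a word
  character, comma or period.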

A quick evaluation of 50k commits from v5.4 showed that the
number of GIT_COMMIT_ID errors dropped from 1032 to 897 (135
fewer). Most of the eliminated errors were caused by a line
break between "commit" and its hash.

Signed-off-by: Dwaipayan Ray <dwaipayanray1@xxxxxxxxx>
---
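For reviewers, a minimal standalone sketch of the two cases described
above. It is not the checkpatch code itself: extract_desc() and
join_desc() are hypothetical helpers invented for illustration, and the
sketch leaves out the $short/$long/$case/$space bookkeeping that the
patch performs.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# SHA followed by the quoted title, e.g.: 0123456789ab ("foo: fix bar")
my $ref = qr/[0-9a-f]{5,}\s+\("([^"]+)"\)/i;

# Hypothetical helper: return the quoted title of the commit reference
# found on $lines->[$i], or nothing if there is none.
sub extract_desc {
	my ($lines, $i) = @_;
	my $line = $lines->[$i];

	# Whole reference on one line: commit <sha> ("<title>")
	return $1 if ($line =~ /\bcommit\s+$ref/i);

	# New case: "commit" ends the previous line, the SHA starts this one.
	# The capturing match is evaluated last so that $1 holds the title.
	if ($i > 0 && $lines->[$i - 1] =~ /\bcommit\s*$/i &&
	    $line =~ /^\s*$ref/) {
		return $1;
	}
	return;
}

# Hypothetical helper: join the two halves of a split title, adding a
# space only when the first half ends in a word character, comma or period.
sub join_desc {
	my ($first, $second) = @_;
	$second = " $second" if ($first =~ /[\w,.]$/);
	return $first . $second;
}

my @log = ('This fixes a regression introduced by commit',
	   '0123456789ab ("foo: fix bar")');
print extract_desc(\@log, 1), "\n";		# foo: fix bar
print join_desc('foo: fix', 'bar'), "\n";	# foo: fix bar
print join_desc('foo: fix-', 'bar'), "\n";	# foo: fix-bar
```

The second helper shows why the unconditional " " in the old code could
produce a description that does not match the real commit title when the
line break falls right after a character such as "-".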
scripts/checkpatch.pl | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 024514946bed..f5ba2beac008 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -2990,6 +2990,16 @@ sub process {
 			if ($line =~ /\bcommit\s+[0-9a-f]{5,}\s+\("([^"]+)"\)/i) {
 				$orig_desc = $1;
 				$hasparens = 1;
+			} elsif ($line =~ /^\s*[0-9a-f]{5,}\s+\("([^"]+)"\)/i &&
+				 defined $rawlines[$linenr-2] &&
+				 $rawlines[$linenr-2] =~ /\bcommit\s*$/i) {
+				$line =~ /^\s*[0-9a-f]{5,}\s+\("([^"]+)"\)/i;
+				$orig_desc = $1;
+				$hasparens = 1;
+				$space = 0;
+				$short = 0 if ($line =~ /\b[0-9a-f]{12,40}/i);
+				$long = 1 if ($line =~ /\b[0-9a-f]{41,}/i);
+				$case = 0 if ($line =~ /\b[0-9a-f]{5,40}[^A-F]/ && $rawlines[$linenr-2] =~ /\b[Cc]ommit\s*$/);
 			} elsif ($line =~ /\bcommit\s+[0-9a-f]{5,}\s*$/i &&
 				 defined $rawlines[$linenr] &&
 				 $rawlines[$linenr] =~ /^\s*\("([^"]+)"\)/) {
@@ -3001,7 +3011,9 @@ sub process {
 				$line =~ /\bcommit\s+[0-9a-f]{5,}\s+\("([^"]+)$/i;
 				$orig_desc = $1;
 				$rawlines[$linenr] =~ /^\s*([^"]+)"\)/;
-				$orig_desc .= " " . $1;
+				my $split_desc = $1;
+				$split_desc = " $split_desc" if ($line =~ /[\w\,\.]$/);
+				$orig_desc .= $split_desc;
 				$hasparens = 1;
 			}

--
2.27.0