[PATCH v5] checkpatch: add support to check 'Fixes:' tag format

From: Wang YanQing
Date: Sun May 03 2020 - 08:41:49 EST


According to submitting-patches.rst, 'Fixes:' tag has a little
stricter condition about the one line summary than normal git
commit description:
â...
Do not split the tag across multiple lines, tags are exempt from
the "wrap at 75 columns" rule in order to simplify parsing scripts
...â

And there is no sanity check for 'Fixes:' tag format in checkpatch
the same as GIT_COMMIT_ID for git commit description, so let's expand
the GIT_COMMIT_ID to add 'Fixes:' tag format check support.

The check supports below formats:
Fixes: 54a4f0239f2e ("KVM: MMU: make kvm_mmu_zap_page() return the number of pages it actually freed")
Fixes: 85f7cd3a2aad ("Revert "media: Kconfig: better support hybrid TV devices"")
Fixes: 878520ac45f9 ("ext4: save the error code which triggered...")
Fixes: 878520ac45f9 ("ext4: save the error code which triggered")
Fixes: 277f27e2f277 ("SUNRPC/cache: Allow garbage collection ... ")

The check doesn't support below formats and it will emit diagnostics info for them:
Fixes: f2c2e717642c ("usb: gadget: add raw-gadget interface"
Fixes: 6c73698904aa pinctrl: qcom: Introduce readl/writel accessors
Fixes: 3fd6e7d9a146 (ASoC: tas571x: New driver for TI TAS571x power amplifiers)
Fixes: 55697cbb44e4 ("arm64: dts: renesas: r8a779{65,80,90}: Add IPMMU devices nodes)
Fixes: ba35f8588f47 (âipvlan: Defer multicast / broadcast processing to a work-queueâ)
Fixes: cd758a9b57ee "KVM: PPC: Book3S HV: Use __gfn_to_pfn_memslot in HPT page fault handler"
Fixes: 9b1640686470 ("scsi: lpfc: Fix use-after-free mailbox cmd completion")
Fixes: 03f6fc6de919 ('ASoC: rt5682: Add the soundwire support')
Fixes: 03404e8ae652("IB/mlx5: Add support to dropless RQ")

Because after GIT_COMMIT_ID supports 'Fixes:' tag format check, it could do
the same check as the UNKNOWN_COMMIT_ID, so we don't need UNKNOWN_COMMIT_ID
anymore and I decide to delete it.

Note: this patch also fixes double quotation mark issue for normal git
commit description, and now it supports double quotation mark in
title line, for example:
Commit e33e2241e272 ("Revert "cfg80211: Use 5MHz bandwidth by default
when checking usable channels"")

Note: this patch also adds diagnostics info support for normal git commit
description format check.

Based on original patch by Joe Perches <joe@xxxxxxxxxxx>

Link: https://lore.kernel.org/lkml/40bfc40958fca6e2cc9b86101153aa0715fac4f7.camel@xxxxxxxxxxx/
Signed-off-by: Wang YanQing <udknight@xxxxxxxxx>
---
Hi! Joe and all.

This is the v5 version, and I have tested it with below command:
$ git log -10000 --no-merges --grep="Fixes:" --format=%H | \
while read commit ; do \
./scripts/checkpatch.pl --git $commit --types=GIT_COMMIT_ID --quiet --nosummary; \
done

The result looks good enough already.

Thanks.

v5:
1: Rebased on '[PATCH v2] checkpatch: fix can't check for too long invalid commit id'.
2: Fix wrong diagnostics info for below non-standard 'Fixes:' formats.
a: Fixes: 03404e8ae652("IB/mlx5: Add support to dropless RQ")
b: Fixes: da97e18458fb: ("perf_event: Add support for LSM and SELinux checks")
3: Reword the diagnostics info for non-standard title format.

v4:
1: Add diagnostics info support, suggested by Joe Perches and Markus Elfring.
2: Delete UNKNOWN_COMMIT_ID and do the check in GIT_COMMIT_ID.

v3:
1: Fix a bug in short title line support.

v2:
1: Add support for double quotation mark in title line, suggested by Markus Elfring.
2: Add support for short title line with/without ellipsis.
3: Add supported format examples and unsupported format examples in changelog.
4: Fix a little wording issue in changelog , suggested by Markus Elfring.

scripts/checkpatch.pl | 120 +++++++++++++++++++++++++++++++++++++-------------
1 file changed, 90 insertions(+), 30 deletions(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 9b47584..a1e4113 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -2818,51 +2818,101 @@ sub process {
$line !~ /^\s*(?:Link|Patchwork|http|https|BugLink|base-commit):/i &&
$line !~ /^This reverts commit [0-9a-f]{7,40}/ &&
($line =~ /\bcommit\s+[0-9a-f]{5,}\b/i ||
+ $line =~ /\bfixes:\s+[0-9a-f]{5,}\b/i ||
($line =~ /(?:\s|^)[0-9a-f]{12,40}(?:[\s"'\(\[]|$)/i &&
- $line !~ /[\<\[][0-9a-f]{12,40}[\>\]]/i &&
- $line !~ /\bfixes:\s*[0-9a-f]{12,40}/i))) {
+ $line !~ /[\<\[][0-9a-f]{12,40}[\>\]]/i))) {
my $init_char = "c";
my $orig_commit = "";
+ my $prefix = "commit";
+ my $prefix_case = "[Cc]ommit";
my $short = 1;
my $long = 0;
my $case = 1;
my $space = 1;
+ my $space2 = 1;
my $hasdesc = 0;
- my $hasparens = 0;
+ my $has_parens_and_dqm = 0; # Double quotation mark
my $hasprefix = 1;
my $id = '0123456789ab';
my $orig_desc = "commit description";
my $description = "";
my $sha1_length_min = 12;
+ my $acrosslines = 0;
+ my $title = "title line";
+ my $desc_mismatch = 0;
+ my $diagnostics = "Diagnostics info:\n";

- if ($line =~ /\b(c)ommit\s+([0-9a-f]{5,})\b/i) {
+ if ($line =~ /\b(f)ixes:\s+([0-9a-f]{5,})\b/i) {
+ $init_char = $1;
+ $orig_commit = lc($2);
+ $prefix = "Fixes:";
+ $prefix_case = "Fixes:";
+ $init_char = "F";
+ $title = "a single line title (without line breaks but ellipsis is fine!)";
+ } elsif ($line =~ /\b(c)ommit\s+([0-9a-f]{5,})\b/i) {
$init_char = $1;
$orig_commit = lc($2);
} elsif ($line =~ /\b([0-9a-f]{12,40})\b/i) {
$orig_commit = lc($1);
$hasprefix = 0;
+ $diagnostics .= "Missing prefix \"$prefix_case\".\n";
+
+ }
+
+ if ($line =~ /\b$prefix\s+[0-9a-f]{41,}/i) {
+ $long = 1;
+ } elsif ($line =~ /\b$prefix\s+[0-9a-f]{$sha1_length_min,40}/i) {
+ $short = 0;
+ } elsif ($hasprefix) {
+ $diagnostics .= "Commit id $orig_commit is too short.\n";
+ }
+
+ if ($line =~ /\b$prefix [0-9a-f]/i) {
+ $space = 0;
+ } elsif ($hasprefix) {
+ $diagnostics .= "Extra whitespace between prefix \"${init_char}" . substr($prefix, 1) . "\" and $orig_commit.\n";
+ }
+
+ if ($line =~ /[0-9a-f]{5,} \S+/i || $line =~ /[0-9a-f]{5,}\s*$/i) {
+ $space2 = 0;
+ } elsif ($hasprefix) {
+ $diagnostics .= "We need a single whitespace and only a single whitespace between $orig_commit and title.\n";
}

- $short = 0 if ($line =~ /\bcommit\s+[0-9a-f]{$sha1_length_min,40}/i);
- $long = 1 if ($line =~ /\bcommit\s+[0-9a-f]{41,}/i);
- $space = 0 if ($line =~ /\bcommit [0-9a-f]/i);
- $case = 0 if ($line =~ /\b[Cc]ommit\s+[0-9a-f]{5,40}[^A-F]/);
- if ($line =~ /\bcommit\s+[0-9a-f]{5,}\s+\("([^"]+)"\)/i) {
+ if ($line =~ /\b$prefix_case\s+[0-9a-f]{5,40}[^A-F]/) {
+ $case = 0;
+ } elsif ($hasprefix) {
+ $line =~ /(\b$prefix)\s+[0-9a-f]{5,40}[^A-F]/i;
+ $diagnostics .= "The prefix \"$1\" has case error.\n";
+ }
+
+ if ($line =~ /\b$prefix\s+[0-9a-f]{5,}\s+\("(.+)"\)/i) {
$orig_desc = $1;
- $hasparens = 1;
+ $has_parens_and_dqm = 1;
+ # Drop the ellipsis
+ if ($prefix eq "Fixes:" && $orig_desc =~ /(\s*\.{3}\s*$)/) {
+ $orig_desc = substr($orig_desc, 0, length($orig_desc) - length($1));
+ }
} elsif ($line =~ /\bcommit\s+[0-9a-f]{5,}\s*$/i &&
defined $rawlines[$linenr] &&
- $rawlines[$linenr] =~ /^\s*\("([^"]+)"\)/) {
+ $rawlines[$linenr] =~ /^\s*\("(.+)"\)/) {
$orig_desc = $1;
- $hasparens = 1;
- } elsif ($line =~ /\bcommit\s+[0-9a-f]{5,}\s+\("[^"]+$/i &&
+ $has_parens_and_dqm = 1;
+ } elsif ($line =~ /\b$prefix\s+[0-9a-f]{5,}\s+\(".+$/i &&
defined $rawlines[$linenr] &&
- $rawlines[$linenr] =~ /^\s*[^"]+"\)/) {
- $line =~ /\bcommit\s+[0-9a-f]{5,}\s+\("([^"]+)$/i;
+ $rawlines[$linenr] =~ /^\s*.+"\)/) {
+ $line =~ /\b$prefix\s+[0-9a-f]{5,}\s+\("(.+)$/i;
$orig_desc = $1;
- $rawlines[$linenr] =~ /^\s*([^"]+)"\)/;
+ $rawlines[$linenr] =~ /^\s*(.+)"\)/;
$orig_desc .= " " . $1;
- $hasparens = 1;
+ $has_parens_and_dqm = 1;
+
+ if ($prefix eq "Fixes:") {
+ $acrosslines = 1;
+ $diagnostics .= "The title acrosses lines.\n";
+ }
+ } elsif ($hasprefix && !$space2) {
+ $diagnostics .= "No title in '(\"<$title>\")' format is found.\n";
}

($id, $description) = git_commit_info($orig_commit,
@@ -2879,10 +2929,31 @@ sub process {
}
}

+ if (defined($id) && $has_parens_and_dqm && ($orig_desc ne $description)) {
+ # Allow short description without too short!
+ if ($prefix eq "Fixes:") {
+ if (length($orig_desc) >= length($description)/2) {
+ my $desc = substr($description, 0, length($orig_desc));
+
+ if ($orig_desc ne $desc) {
+ $desc_mismatch = 1;
+ $diagnostics .= "The title doesn't match the original commit.\n";
+ }
+ } else {
+ $desc_mismatch = 1;
+ $diagnostics .= "The title is too abbreviated, at least half of orignial commit title is necessary.\n";
+ }
+ } else {
+ $desc_mismatch = 1;
+ $diagnostics .= "The title doesn't match the original commit.\n";
+ }
+ }
+
if (defined($id) &&
- ($short || $space || $case || ($orig_desc ne $description) || !$hasparens)) {
+ ($short || $space || $space2 || $case || $desc_mismatch || !$hasprefix || !$has_parens_and_dqm || $acrosslines)) {
ERROR("GIT_COMMIT_ID",
- "Please use git commit description style 'commit <$sha1_length_min+ chars of sha1> (\"<title line>\")' - ie: '${init_char}ommit $id (\"$description\")'\n" . $herecurr);
+ "Please use git commit description style '$prefix <$sha1_length_min+ chars of sha1> (\"<$title>\")' - ie: '${init_char}" . substr($prefix, 1) .
+ " $id (\"$description\")'\n" . $diagnostics . $herecurr);
}
}

@@ -2982,17 +3053,6 @@ sub process {
}
}

-# check for invalid commit id
- if ($in_commit_log && $line =~ /^fixes:\s+([0-9a-f]{6,40})\b/i) {
- my $id;
- my $description;
- ($id, $description) = git_commit_info($1, undef, undef);
- if (!defined($id)) {
- WARN("UNKNOWN_COMMIT_ID",
- "Unknown commit id '$1', maybe rebased or not pulled?\n" . $herecurr);
- }
- }
-
# ignore non-hunk lines and lines being removed
next if (!$hunk_line || $line =~ /^-/);

--
1.8.5.6.2.g3d8a54e.dirty