Re: [PATCH] checkpatch: Flag spam header (X-Spam-Report) to prevent spurious warnings

From: Darren Hart
Date: Wed Mar 22 2017 - 11:25:56 EST


On Tue, Mar 21, 2017 at 11:31:08AM -0700, Joe Perches wrote:
> On Tue, 2017-03-21 at 09:30 -0700, John 'Warthog9' Hawley (VMware) wrote:
> > Spamassassin sticks a long (~79 character) long string after a
> > line that has a single space in it. The line with space causes
> > checkpatch to erroniously think that it's in the content body, as
> > opposed to headers and thus flag a mail header as an unwrapped long
> > comment line.
>
> If the spammassassin header is like
>
> email-header-n: foo
> email-header-m: bar
>
> X-Spam-Report: bar

The specific content of the X-Spam-Report that triggers this for me,
from this patch for example, is:

=== 8< ===
X-Spam-Report: SpamAssassin version 3.4.1 on casper.infradead.org summary:
Content analysis details: (-1.9 points, 5.0 required)

pts rule name description
---- ---------------------- --------------------------------------------------
-1.9 BAYES_00 BODY: Bayes spam probability is 0 to 1%
[score: 0.0000]
X-TUID: alGBIuPZmqOj

=== >8 ===

The long ---- ----... line is over 75 characters and triggers the test
for long commit_log lines.

>
> Does that form follow rfc 5322?

By my reading, this is governed by the long header fields defined by
2.2.3, with whitespace folding defined as "a CRLF may be inserted before
any WSP."

>
> If it does then any email header could have that
> form and the header wrapping test should be

Yes, agreed.

So the logic we want is:

If we are in headers and we detect a CRLF and the next line starts with a WSP,
then we are still in headers (and therefor not in the commit log). The CRLF
information does not appear to be available as it is replaced with just \n.

> updated from
>
> if ($in_header_lines && $realfile =~ /^$/ &&
>     !($rawline =~ /^\s+\S/ ||
>       $rawline =~ /^(commit\b|from\b|[\w-]+:).*$/i)) {
> $in_header_lines = 0;
> $in_commit_log = 1;
> $has_commit_log = 1;
> }
>
> to something like
>
> if ($in_header_lines && $realfile =~ /^$/ &&
>     !($rawline =~ /^ (?:\s*\S|$)/ ||

Hrm... lines that start with maybe a space followed by a : ... Why did you
introduce that part of the check?

Looking at this more closely, I was also not clear why the original test looked
for several spaces followed by non-space. What case is this for?

> $rawline =~ /^(commit\b|from\b|[\w-]+:).*$/i)) {
>
>

Thanks,

--
Darren Hart
VMware Open Source Technology Center