Re: [PATCH] checkpatch: use patch subject when reading from stdin

From: Geert Uytterhoeven
Date: Tue Oct 15 2019 - 02:49:43 EST


Hi Joe,

On Mon, Oct 14, 2019 at 4:48 PM Joe Perches <joe@xxxxxxxxxxx> wrote:
> On Mon, 2019-10-14 at 11:20 +0200, Geert Uytterhoeven wrote:
> > On Tue, Oct 8, 2019 at 8:10 PM Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> wrote:
> > > On Tue, Oct 8, 2019 at 7:02 PM Joe Perches <joe@xxxxxxxxxxx> wrote:
> > > > On Tue, 2019-10-08 at 17:28 +0200, Geert Uytterhoeven wrote:
> > > > > On Tue, Oct 8, 2019 at 5:20 PM Joe Perches <joe@xxxxxxxxxxx> wrote:
> > > > > > On Tue, 2019-10-08 at 11:40 +0200, Geert Uytterhoeven wrote:
> > > > > > > When reading a patch file from standard input, checkpatch calls it "Your
> > > > > > > patch", and reports its state as:
> > > > > > >
> > > > > > > Your patch has style problems, please review.
> > > > > > >
> > > > > > > or:
> > > > > > >
> > > > > > > Your patch has no obvious style problems and is ready for submission.
> > > > > > >
> > > > > > > Hence when checking multiple patches by piping them to checkpatch, e.g.
> > > > > > > when checking patchwork bundles using:
> > > > > > >
> > > > > > > formail -s scripts/checkpatch.pl < bundle-foo.mbox
> > > > > > >
> > > > > > > it is difficult to identify which patches need to be reviewed and
> > > > > > > improved.
> > > > > > >
> > > > > > > Fix this by replacing "Your patch" by the patch subject, if present.
> > > > > > []
> > > > > > > diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> > > > > > []
> > > > > > > @@ -1047,6 +1047,10 @@ for my $filename (@ARGV) {
> > > > > > > }
> > > > > > > while (<$FILE>) {
> > > > > > > chomp;
> > > > > > > + if ($vname eq 'Your patch') {
> > > > > > > + my ($subject) = $_ =~ /^Subject:\s*(.*)/;
> > > > > > > + $vname = '"' . $subject . '"' if $subject;
> > > > > >
> > > > > > Hi again Geert.
> > > > > >
> > > > > > Just some stylistic nits:
> > > > > >
> > > > > > $filename is not quoted so I think adding quotes
> > > > > > before and after $subject may not be useful.
> > > > >
> > > > > Filename is indeed not quoted, but $git_commits{$filename} is.
> > > >
> > > > If I understand your use case, this will only show the last
> > > > patch $subject of a bundle?
> > >
> > > False.
> > > "formail -s scripts/checkpatch.pl < bundle-foo.mbox" splits
> > > "bundle-foo.mbox" in separate patches, and invokes
> > > "scripts/checkpatch.pl" for each of them.
> > >
> > > > Also, it'll show things like "duplicate signature" when multiple
> > > > patches are tested in a single bundle.
> > >
> > > False, due to the splitting by formail.
> > >
> > > > For instance, if I have a git format-patch series in an output
> > > > directory and do
> > > >
> > > > $ cat <output_dir>/*.patch | ./scripts/checkpatch.pl
> > > >
> > > > Bad output happen.
> > >
> > > Yeah, because you're concatenating all patches.
> > > Currently it works for single patches only.
> > >
> > > > Maybe this might be better:
> > > > --- a/scripts/checkpatch.pl
> > > > +++ b/scripts/checkpatch.pl
> > > > @@ -2444,6 +2444,15 @@ sub process {
> > > >
> > > > my $rawline = $rawlines[$linenr - 1];
> > > >
> > > > +# if input from stdin, report the subject lines if they exist
> > > > + if ($filename eq '-' && !$quiet &&
> > > > + $rawline =~ /^Subject:\s*(.*)/) {
> > > > + report("stdin", "STDIN", '-' x length($1));
> > > > + report("stdin", "STDIN", $1);
> > > > + report("stdin", "STDIN", '-' x length($1));
> > > > + %signatures = (); # avoid duplicate signatures
> > > > + }
> > > > +
> > > > # check if it's a mode change, rename or start of a patch
> > > > if (!$in_commit_log &&
> > > > ($line =~ /^ mode change [0-7]+ => [0-7]+ \S+\s*$/ ||
> > >
> > > Perhaps. Just passing the patchwork bundle to checkpatch, and fixing
> > > checkpatch to handle multiple patches in a single file was my first idea.
> > > But it looked fragile, with too much state that needs to be reset.
> > > I.e. the state is not limited to %signatures. You also have to reset
> > > $author inside process(), and probably a dozen other variables.
> > > And make sure that future changes don't forget resetting all newly
> > > introduced variables.
> > >
> > > Hence I settled for the solution using formail.
> >
> > I gave your solution a try.
> > It only enables the reset-on-next-patch feature when using stdin.
> > Thanks to the printed subject, it's now obvious to which patch a
> > message applies to.
> > However, the output is significantly different than when passing
> > a split patch series. Can this be improved upon?
> >
> > Note that the only reason I'm using stdin is that I use formail to split
> > a bundle in individual patches. Once checkpatch supports bundles (or
> > mboxes) containing multiple patches, there's no longer a need for
> > using formail, and the reset-on-next-patch feature should be
> > enabled unconditionally.
>
> Using your collection of little tools idea,
> why not write a trivial script like:
>
> grep "^Subject:" $1
> checkpatch.pl $1
>
> and use that as the command line for formail
> instead of adding unnecessary complexity to
> checkpatch?

That would be another possibility.

But given more maintainers are starting to apply patchwork bundles (cfr.
the workflows discussions), it makes sense to make their lives easier.

This is also useful for maintainers who save all patches to apply into a
single mbox, and run checkpatch+git-am on that.

Summarized: git-am handles multiple patches, checkpatch requires
splitting.

Thanks!

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds