Re: [PATCH] checkpatch.pl: Add SPDX license tag check

From: Joe Perches
Date: Thu Nov 09 2017 - 13:27:49 EST


On Thu, 2017-11-09 at 12:12 -0600, Rob Herring wrote:
> On Thu, Nov 9, 2017 at 9:39 AM, Joe Perches <joe@xxxxxxxxxxx> wrote:
> > On Thu, 2017-11-09 at 07:47 -0600, Rob Herring wrote:
> > > On Wed, Nov 8, 2017 at 8:10 PM, Joe Perches <joe@xxxxxxxxxxx> wrote:
> > > > On Wed, 2017-11-08 at 19:10 -0600, Rob Herring wrote:
> > > > > Add a check warning if SPDX-License-Identifier tags are not used in
> > > > > newly added files.
> > > >
> > > > If this is to be done, and I think it's not a great idea,
> > >
> > > Which part? SPDX tags or checking new files or just using checkpatch for this?
> >
> > SPDX tags in all files.

Is having an SPDX tag in every file really desired?

> >
> > There's no real way to check a patch for this.
> >
> > You have to check the entire file.
>
> Changing existing files is a separate problem. There is a script for
> that (though the data file is not public). I'm only concerned with new
> files here because that's what I review, and I have to tell folks to
> replace their 2 pages of license text with SPDX tags. (It will be much
> easier to just tell them to run checkpatch. ;) ).
>
> > checkpatch could, as you've done, scan for new files
> > against /dev/null, but a single patch can add
> > multiple files and each newly added file should be
> > checked for a missing SPDX indicator.
>
> I was going with the easy route of just giving one warning per patch.
> I'd hope that's enough info for folks to figure out what's needed from
> there. However, it should be possible to make it per file. The main
> complication is that we need to look for either '^+++' or the end of
> the patch, which I didn't see an easy/clean way to do.

Detecting EOF is easy.
There is already a $realfile test that marks the start of each file.
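
A rough sketch of that per-file logic, written as a standalone shell/awk
helper rather than as checkpatch's Perl. It treats each "--- " / "+++ "
header pair as the start of a file and emits any pending warning at the
next header or at EOF. The name check_new_files_spdx is made up for
illustration, and the sketch assumes git-style diffs whose paths contain
no spaces.

```shell
# Hypothetical helper, not part of checkpatch.pl: read a unified diff on
# stdin and warn once per newly added file that has no SPDX tag.
check_new_files_spdx() {
    awk '
        # Warn about the file just finished if it was new and untagged,
        # then reset the per-file state.
        function flush_file() {
            if (is_new && !has_tag)
                printf "WARNING: missing SPDX-License-Identifier in new file %s\n", name
            is_new = 0; has_tag = 0; name = ""
        }
        # A "+++ " line directly after a "--- " line starts a new file.
        # Simplification: hunk bodies are not counted, so a removed line
        # starting with "-- " followed by an added line starting with
        # "++ " would be misread as a header.
        /^\+\+\+ / && prev ~ /^--- / {
            flush_file()
            is_new = (prev ~ /^--- \/dev\/null/)
            name = $2
            sub(/^b\//, "", name)
            prev = $0
            next
        }
        # Any added line carrying the tag marks the current file as tagged.
        is_new && /^\+/ && tolower($0) ~ /spdx-license-identifier:/ { has_tag = 1 }
        { prev = $0 }
        # End of patch closes the last file.
        END { flush_file() }
    '
}
```

Something like "git format-patch -1 --stdout | check_new_files_spdx"
would then print one warning line per untagged new file.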

> > My concern is that there are ~50,000 files in the
> > kernel source tree and, after that scripted patch
> > adding the tags, only about a quarter of them have
> > an SPDX tag.
> >
> > So which files actually _need_ an SPDX tag?
> >
> > files in -next with an SPDX tag:
> >
> > $ git grep --name-only -i -P "spdx-licen[cs]e-identifier" | \
> > while read -r file ; do basename "$file" ; done | \
> > sed -r -e 's/^.*(\..*)/\1/' | \
> > sort | uniq -c | sort -rn | head -10
> >    7514 .h
> >    3435 .c
> >    1193 Makefile
> >     486 .S
> >     221 .dts
> >     186 Kconfig
> >     185 .dtsi
> >      97 .sh
> >      34 .tc
> >      24 .debug
> >
> > vs. all files in -next (excluding Documentation/):
> >
> > $ git ls-files | grep -v "^Documentation/" | \
> > while read -r file ; do basename "$file" ; done | \
> > sed -r -e 's/^.*(\..*)/\1/' | \
> > sort | uniq -c | sort -rn | head -10
> >   25946 .c
> >   20360 .h
> >    2437 Makefile
> >    1454 .S
> >    1442 .dts
> >    1380 Kconfig
> >    1099 .dtsi
> >     207 .json
> >     204 .gitignore
> >     194 .sh
> >
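
To put the two listings side by side, something like the following
could print tagged/total per suffix. It is only a sketch: the name
spdx_coverage and its two file arguments are made up here, and it
assumes one path per line in each file.

```shell
# Hypothetical helper: $1 lists paths that carry an SPDX tag, $2 lists
# all paths. Prints "<suffix> <tagged>/<total> <percent>" per suffix.
spdx_coverage() {
    awk -v taggedfile="$1" '
        # Reduce a path to its last dot-suffix, or to the whole basename
        # when there is no dot (Makefile, Kconfig).
        function suffix(path,   base) {
            base = path
            sub(/^.*\//, "", base)
            if (match(base, /\.[^.]*$/))
                return substr(base, RSTART)
            return base
        }
        # Count the tagged paths per suffix before reading the full list.
        BEGIN {
            while ((getline line < taggedfile) > 0)
                tagged[suffix(line)]++
        }
        # Count every path in the full list per suffix.
        { total[suffix($0)]++ }
        END {
            for (s in total)
                printf "%s %d/%d %.1f%%\n", s, tagged[s], total[s], 100 * tagged[s] / total[s]
        }
    ' "$2" | LC_ALL=C sort
}
```

Possible usage from the top of the tree (the file names are arbitrary,
and the same Documentation/ filter is applied to both lists so the
counts are comparable):

$ git grep --name-only -i -P "spdx-licen[cs]e-identifier" | \
  grep -v "^Documentation/" > tagged.txt
$ git ls-files | grep -v "^Documentation/" > all.txt
$ spdx_coverage tagged.txt all.txt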