Re: [PATCH] checkpatch.pl: Add SPDX license tag check

From: Rob Herring
Date: Thu Nov 09 2017 - 13:12:46 EST


On Thu, Nov 9, 2017 at 9:39 AM, Joe Perches <joe@xxxxxxxxxxx> wrote:
> On Thu, 2017-11-09 at 07:47 -0600, Rob Herring wrote:
>> On Wed, Nov 8, 2017 at 8:10 PM, Joe Perches <joe@xxxxxxxxxxx> wrote:
>> > On Wed, 2017-11-08 at 19:10 -0600, Rob Herring wrote:
>> > > Add a check warning if SPDX-License-Identifier tags are not used in
>> > > newly added files.
>> >
>> > If this is to be done, and I think it's not a great idea,
>>
>> Which part? SPDX tags or checking new files or just using checkpatch for this?
>
> SPDX tags in all files.
>
> There's no real way to check a patch for this.
>
> You have to check the entire file.

Changing existing files is a separate problem. There is a script for
that (though the data file is not public). I'm only concerned with new
files here because that's what I review, and I have to tell folks to
replace their 2 pages of license text with SPDX tags. (It will be much
easier to just tell them to run checkpatch. ;) ).

> checkpatch could, as you've done, scan for new files
> against /dev/null, but a single patch can add
> multiple files and each newly added file should have
> a missing SPDX indicator check.

I was going with the easy route of just giving one warning per patch.
I'd hope that's enough info for folks to figure out what's needed from
there. However, it should be possible to make it per file. The main
complication is that, to close out each new file, we need to look for
either the next '^+++' line or the end of the patch, and I didn't see
an easy/clean way to do that.
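
To illustrate the per-file variant, here is a rough sketch in Python
rather than checkpatch's Perl. It is not the patch under discussion.
It assumes git-format patches, where a new file shows up as a
'--- /dev/null' line followed by a '+++ b/<path>' line. The function
name new_files_missing_spdx and the variable names are made up for
this example. Each file's scan is closed by the next file header or by
the end of the patch.

```python
import re

# Assumption: the tag is matched case-sensitively, as it is normally written.
SPDX_RE = re.compile(r"SPDX-License-Identifier:")


def new_files_missing_spdx(patch_text):
    """Return paths of newly added files whose added lines lack an SPDX tag."""
    lines = patch_text.splitlines()
    missing = []
    current = None   # path of the new file being scanned, or None
    tagged = False   # whether the current new file has shown a tag yet
    skip = -1        # index of the '+++ ' header line, which is not content

    for i, line in enumerate(lines):
        is_header = (line.startswith("--- ") and i + 1 < len(lines)
                     and lines[i + 1].startswith("+++ "))
        if is_header:
            # The next file header closes the previous file's scan.
            if current is not None and not tagged:
                missing.append(current)
            if line[4:].strip() == "/dev/null":
                path = lines[i + 1][4:].strip()
                current = path[2:] if path.startswith("b/") else path
            else:
                current = None   # existing file: not checked here
            tagged = False
            skip = i + 1
        elif current is not None and i != skip and line.startswith("+"):
            if SPDX_RE.search(line):
                tagged = True

    # The end of the patch closes the last file's scan.
    if current is not None and not tagged:
        missing.append(current)
    return missing
```

The header test here is deliberately simple: a removed line starting
with '-- ' directly followed by an added line starting with '++ ' in a
modified file would be misread as a file header. Tracking the hunk
line counts from the '@@' headers would avoid that.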

> My concern is that there are ~50,000 files in the
> kernel source tree and, after that scripted patch
> adding the tags, only about a quarter of them have
> an SPDX tag.
>
> So which files actually _need_ a SPDX tag?
>
> files in -next with an SPDX tag:
>
> $ git grep --name-only -i -P "spdx-licen[cs]e-identifier" | \
> while read file ; do basename $file ; done | \
> sed -r -e 's/^.*(\..*)/\1/' | \
> sort | uniq -c | sort -rn | head -10
>    7514 .h
>    3435 .c
>    1193 Makefile
>     486 .S
>     221 .dts
>     186 Kconfig
>     185 .dtsi
>      97 .sh
>      34 .tc
>      24 .debug
>
> vs all files in -next (not Documentation/)
>
> $ git ls-files | grep -v "^Documentation/" | \
> while read file ; do basename $file ; done | \
> sed -r -e 's/^.*(\..*)/\1/' | \
> sort | uniq -c | sort -rn | head -10
>   25946 .c
>   20360 .h
>    2437 Makefile
>    1454 .S
>    1442 .dts
>    1380 Kconfig
>    1099 .dtsi
>     207 .json
>     204 .gitignore
>     194 .sh
>