spdx spring cleaning

From: Rasmus Villemoes
Date: Fri Feb 26 2021 - 07:32:51 EST


Hi,

I was doing some 'git grep SPDX-License-Identifier' statistics, but
noticed that I had to do a lot more normalization than expected (clearly
handling different comment markers is needed).
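
For illustration, a rough sketch of the kind of counting meant here. The helper name spdx_stats and the set of comment terminators it strips are assumptions made for this example, not the commands actually used for the statistics above. It reduces each SPDX line to the bare license expression and counts the distinct expressions.

```shell
#!/bin/sh
# Hypothetical helper: read SPDX header lines on stdin, strip the leading
# comment marker and any trailing "*/" or "-->", then count each distinct
# license expression, most frequent first.
spdx_stats() {
    sed -E -n '/SPDX-License-Identifier:/ {
        s/^.*SPDX-License-Identifier:[[:space:]]*//
        s/[[:space:]]*(\*\/|-->)?[[:space:]]*$//
        p
    }' | sort | uniq -c | sort -rn
}

# Sample lines made up for this example; in a kernel tree the input
# could instead come from: git grep -h SPDX-License-Identifier
printf '%s\n' \
    '// SPDX-License-Identifier: GPL-2.0' \
    '/* SPDX-License-Identifier: GPL-2.0 */' \
    '# SPDX-License-Identifier: MIT' | spdx_stats
```

Even this small version has to know about three comment styles, and it still counts "GPL-2.0 or MIT" and "GPL-2.0 OR MIT" as different expressions, which is the sort of noise the fixups below try to remove at the source.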

How about running something like the below after -rc1? The end result is

2558 files changed, 2558 insertions(+), 2558 deletions(-)

mostly from the last fixup; before that it's merely

90 files changed, 90 insertions(+), 90 deletions(-)

Rasmus

#!/bin/sh

fixup() {
    gp="$1"
    cmd="$2"

    git grep --files-with-matches "SPDX-License-Identifier:$gp" | grep -v COPYING | \
        xargs -r -P8 sed -E -s -i -e "1,3 { /SPDX-License-Identifier/ { $cmd } }"
    git diff --stat | tail -n1
}

# tab->space, the first string is "dot asterisk tab" (printf supplies
# the literal tab so it survives copy and paste)
fixup "$(printf '.*\t')" 's/\t/ /g'

# trailing space
fixup '.* $' 's/ *$//'

# collapse multiple spaces (runs of two or more)
fixup '.* \{2,\}' 's/ {2,}/ /g'

# or -> OR
fixup '.* or ' 's/ or / OR /g'

# Remove the outer parentheses - when that pair is the only set of
# parentheses. Only no trailing comment marker or a trailing */ is handled.
fixup ' (' 's|Identifier: \(([^()]*)\)( \*/)?$|Identifier: \1\2|'
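
To see what the last two expressions do without touching any files, here is a minimal sketch. The preview helper and the sample header lines are made up for this illustration, and GNU sed is assumed, as in the script above. It applies a fixup expression to stdin using the same "first three lines only" guard, but without -i.

```shell
#!/bin/sh
# Hypothetical helper (not part of the script above): apply one fixup
# expression to text on stdin instead of editing files in place.
preview() {
    sed -E -e "1,3 { /SPDX-License-Identifier/ { $1 } }"
}

# The same parenthesis-removal expression as the last fixup call.
unparen='s|Identifier: \(([^()]*)\)( \*/)?$|Identifier: \1\2|'

# A single outer pair is removed, with or without a trailing */.
printf '%s\n' '// SPDX-License-Identifier: (GPL-2.0 or MIT)' |
    preview 's/ or / OR /g' | preview "$unparen"
printf '%s\n' '/* SPDX-License-Identifier: (GPL-2.0 OR MIT) */' |
    preview "$unparen"

# More than one pair of parentheses: the line is left untouched.
printf '%s\n' '// SPDX-License-Identifier: (GPL-2.0) OR (MIT)' |
    preview "$unparen"
```

The first two lines come out as "GPL-2.0 OR MIT" with their comment markers intact, while the third is printed unchanged because [^()]* cannot span an inner parenthesis.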