Re: [PATCH] docs: license-rules.txt: cover SPDX headers on Python scripts
From: Greg Kroah-Hartman
Date: Thu Sep 05 2019 - 05:27:09 EST
On Thu, Sep 05, 2019 at 06:23:13AM -0300, Mauro Carvalho Chehab wrote:
> The author of the license-rules.rst file wanted to be very strict
> with regards to the location of the SPDX header. It says that
> the SPDX header "shall be added at the first possible line in
> a file which can contain a comment". Not happy with this already
> restrictive requirement, it goes further:
>
> "For the majority of files this is the first line, except for
> scripts", opening an exception to have the SPDX header at the
> second line, if the first line starts with "#!".
>
> Well, it turns out that this is too restrictive for Python scripts,
> and may cause regressions if it were enforced.
>
> As mentioned on:
> https://stackoverflow.com/questions/728891/correct-way-to-define-python-source-code-encoding
>
> Python's PEP-263 [1] dictates that a script that needs to default to
> UTF-8 encoding has to follow this rule:
>
> 'Python will default to ASCII as standard encoding if no other
> encoding hints are given.
>
> To define a source code encoding, a magic comment must be placed
> into the source files either as first or second line in the file'
>
> And:
> 'More precisely, the first or second line must match the following
> regular expression:
>
> ^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)'
>
> [1] https://www.python.org/dev/peps/pep-0263/
>
> If a script has both "#!" and the charset encoding line, we can't place
> an SPDX tag without either violating license-rules.rst or breaking the
> script by making it crash on non-ASCII characters.
>
> So, add a short note saying that, for Python scripts, the SPDX
> header may be placed as late as the third line, in order to cover the
> case where both "#!" and "# .*coding.*UTF-8" lines are found.
>
> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@xxxxxxxxxx>
> ---
> Documentation/process/license-rules.rst | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/process/license-rules.rst b/Documentation/process/license-rules.rst
> index 2ef44ada3f11..5d23e3498b1c 100644
> --- a/Documentation/process/license-rules.rst
> +++ b/Documentation/process/license-rules.rst
> @@ -64,9 +64,12 @@ License identifier syntax
> possible line in a file which can contain a comment. For the majority
> of files this is the first line, except for scripts which require the
> '#!PATH_TO_INTERPRETER' in the first line. For those scripts the SPDX
> - identifier goes into the second line.
> + identifier goes into the second line\ [1]_.
>
> -|
> +.. [1] Please notice that Python scripts may also need an encoding rule
> + as defined on PEP-263, which should be defined either at the first
> + or the second line. So, for such scripts, the SPDX identifier may
> + go up to the third line.
>
> 2. Style:
>
If you are going to do this, can you also fix up scripts/spdxcheck.py to
properly catch this, as well as fixing up the location of the spdx tag
line in the file itself?
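
For illustration only, here is a rough sketch of the kind of check meant
above. It is not the actual scripts/spdxcheck.py code; the helper names
(expected_spdx_line, spdx_tag_in_place) are hypothetical, and the regular
expression is the one quoted from PEP-263 in the patch description.

```python
import re

# PEP-263 encoding declaration pattern, as quoted in the patch description.
CODING_RE = re.compile(r'^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)')
SPDX_TAG = 'SPDX-License-Identifier:'


def expected_spdx_line(lines):
    """Return the 1-based line number where the SPDX tag is expected.

    Hypothetical helper: skips an optional '#!' line and then an optional
    PEP-263 coding line, so the result is 1, 2 or 3.
    """
    idx = 0
    if idx < len(lines) and lines[idx].startswith('#!'):
        idx += 1
    # idx is 0 or 1 here, i.e. still within the first two lines,
    # which is where PEP-263 allows the coding comment.
    if idx < len(lines) and CODING_RE.match(lines[idx]):
        idx += 1
    return idx + 1


def spdx_tag_in_place(lines):
    """Return True if the SPDX tag sits on the expected line."""
    pos = expected_spdx_line(lines) - 1
    return pos < len(lines) and SPDX_TAG in lines[pos]
```

With something along these lines, a script that starts with "#!", then a
coding line, then the SPDX tag would be accepted, while a tag that appears
any later than that would still be flagged.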
thanks,
greg k-h