Re: [PATCH 1/2] scripts/spdxcheck.py: Always open files in binary mode

From: Jeremy Cline
Date: Wed Dec 12 2018 - 13:14:16 EST


Hi,

On Wed, Dec 12, 2018 at 02:12:09PM +0100, Thierry Reding wrote:
> From: Thierry Reding <treding@xxxxxxxxxx>
>
> The spdxcheck script currently falls over when confronted with a binary
> file (such as Documentation/logo.gif). To avoid that, always open files
> in binary mode and decode line-by-line, ignoring encoding errors.
>
> One tricky case is when piping data into the script and reading it from
> standard input. By default, standard input will be opened in text mode,
> so we need to reopen it in binary mode.
>
> Signed-off-by: Thierry Reding <treding@xxxxxxxxxx>
> ---
> scripts/spdxcheck.py | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/scripts/spdxcheck.py b/scripts/spdxcheck.py
> index 5056fb3b897d..e559c6294c39 100755
> --- a/scripts/spdxcheck.py
> +++ b/scripts/spdxcheck.py
> @@ -168,6 +168,7 @@ class id_parser(object):
> self.curline = 0
> try:
> for line in fd:
> + line = line.decode(locale.getpreferredencoding(False), errors='ignore')
> self.curline += 1
> if self.curline > maxlines:
> break
> @@ -249,12 +250,13 @@ if __name__ == '__main__':
>
> try:
> if len(args.path) and args.path[0] == '-':
> - parser.parse_lines(sys.stdin, args.maxlines, '-')
> + stdin = os.fdopen(sys.stdin.fileno(), 'rb')
> + parser.parse_lines(stdin, args.maxlines, '-')
> else:
> if args.path:
> for p in args.path:
> if os.path.isfile(p):
> - parser.parse_lines(open(p), args.maxlines, p)
> + parser.parse_lines(open(p, 'rb'), args.maxlines, p)
> elif os.path.isdir(p):
> scan_git_subtree(repo.head.reference.commit.tree, p)
> else:
> --
> 2.19.1
>
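
For anyone skimming the thread, here is a minimal standalone sketch of the
decode-per-line approach described above. It is not the script itself:
decode_lines and sample are hypothetical names used only for illustration,
and the maxlines default is an assumption.

```python
import io
import locale


def decode_lines(fd, maxlines=15):
    """Yield up to maxlines text lines from a file object opened in binary mode.

    Hypothetical helper for illustration; it is not part of spdxcheck.py.
    """
    encoding = locale.getpreferredencoding(False)
    for lineno, raw in enumerate(fd, start=1):
        if lineno > maxlines:
            break
        # Bytes that cannot be decoded are dropped instead of raising
        # UnicodeDecodeError, so binary files no longer abort the scan.
        yield raw.decode(encoding, errors='ignore')


# Binary junk on the first line, a license tag on the second.
sample = io.BytesIO(b"GIF89a\xff\xfe\x00\n# SPDX-License-Identifier: GPL-2.0\n")
lines = list(decode_lines(sample))
```

The second line still decodes cleanly even though the first one contains
bytes that are not valid in most text encodings.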

It might be worth noting that this fixes commit 6f4d29df66ac
("scripts/spdxcheck.py: make python3 compliant"), and Cc'ing stable,
since 6f4d29df66ac was backported to v4.19. While that commit did make
the script work with Python 3 when piping data, it broke Python 2, and
that breakage made its way to stable.
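
As a side note on the stdin case, a rough sketch of one way to get a binary
stream on both Python 2 and Python 3. The as_binary helper is hypothetical
and not something the patch adds. It assumes a Python 3 text stream exposes
its underlying binary stream as .buffer, and otherwise falls back to the
os.fdopen() call used in the patch.

```python
import io
import os


def as_binary(stream):
    """Return a binary view of a stream (hypothetical helper, not in the patch)."""
    # Python 3 text streams such as sys.stdin expose the raw bytes as .buffer.
    buffer = getattr(stream, 'buffer', None)
    if buffer is not None:
        return buffer
    # Otherwise reopen the descriptor in binary mode, as the patch does.
    return os.fdopen(stream.fileno(), 'rb')


# Stand-in for a text-mode stdin fed with partly binary input.
text = io.TextIOWrapper(
    io.BytesIO(b"\xff\n# SPDX-License-Identifier: GPL-2.0\n"), encoding='utf-8')
raw_lines = list(as_binary(text))
```

With something like this, the stdin branch would pass as_binary(sys.stdin)
to parse_lines() instead of sys.stdin. The os.fdopen() approach in the patch
works just as well, so this is only an alternative.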

Reviewed-by: Jeremy Cline <jcline@xxxxxxxxxx>

Regards,
Jeremy