[PATCH] kunit: tool: continue past invalid utf-8 output

From: Daniel Latypov
Date: Fri Oct 08 2021 - 17:08:05 EST


kunit.py currently crashes and fails to parse kernel output if it's not
fully valid utf-8.

This can come from memory corruption or or just inadvertently printing
out binary data as strings.

E.g. adding this line into a kunit test
pr_info("\x80")
will cause this exception
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 1961: invalid start byte

We can tell Python how to handle errors, see
https://docs.python.org/3/library/codecs.html#error-handlers

Unfortunately, it doesn't seem like there's a way to specify this in
just one location, so we need to repeat ourselves quite a bit.

Specify `errors='backslashreplace'` so we instead:
* print out the offending byte as '\x80'
* try and continue parsing the output.
* as long as the TAP lines themselves are valid, we're fine.

Signed-off-by: Daniel Latypov <dlatypov@xxxxxxxxxx>
---
tools/testing/kunit/kunit.py | 3 ++-
tools/testing/kunit/kunit_kernel.py | 4 ++--
2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/tools/testing/kunit/kunit.py b/tools/testing/kunit/kunit.py
index 9c9ed4071e9e..28ae096d4b53 100755
--- a/tools/testing/kunit/kunit.py
+++ b/tools/testing/kunit/kunit.py
@@ -457,9 +457,10 @@ def main(argv, linux=None):
sys.exit(1)
elif cli_args.subcommand == 'parse':
if cli_args.file == None:
+ sys.stdin.reconfigure(errors='backslashreplace')
kunit_output = sys.stdin
else:
- with open(cli_args.file, 'r') as f:
+ with open(cli_args.file, 'r', errors='backslashreplace') as f:
kunit_output = f.read().splitlines()
request = KunitParseRequest(cli_args.raw_output,
None,
diff --git a/tools/testing/kunit/kunit_kernel.py b/tools/testing/kunit/kunit_kernel.py
index faa6320e900e..f08c6c36a947 100644
--- a/tools/testing/kunit/kunit_kernel.py
+++ b/tools/testing/kunit/kunit_kernel.py
@@ -135,7 +135,7 @@ class LinuxSourceTreeOperationsQemu(LinuxSourceTreeOperations):
stdin=subprocess.PIPE,
stdout=subprocess.PIPE,
stderr=subprocess.STDOUT,
- text=True, shell=True)
+ text=True, shell=True, errors='backslashreplace')

class LinuxSourceTreeOperationsUml(LinuxSourceTreeOperations):
"""An abstraction over command line operations performed on a source tree."""
@@ -172,7 +172,7 @@ class LinuxSourceTreeOperationsUml(LinuxSourceTreeOperations):
stdin=subprocess.PIPE,
stdout=subprocess.PIPE,
stderr=subprocess.STDOUT,
- text=True)
+ text=True, errors='backslashreplace')

def get_kconfig_path(build_dir) -> str:
return get_file_path(build_dir, KCONFIG_PATH)

base-commit: a032094fc1ed17070df01de4a7883da7bb8d5741
--
2.33.0.882.g93a45727a2-goog