Re: [PATCH] kunit: tool: yield output from run_kernel in real time

From: Daniel Latypov
Date: Mon Oct 04 2021 - 21:17:33 EST


On Mon, Oct 4, 2021 at 6:08 PM David Gow <davidgow@xxxxxxxxxx> wrote:
>
> On Tue, Oct 5, 2021 at 8:36 AM Daniel Latypov <dlatypov@xxxxxxxxxx> wrote:
> >
> > On Mon, Oct 4, 2021 at 4:58 PM David Gow <davidgow@xxxxxxxxxx> wrote:
> > >
> > > On Tue, Oct 5, 2021 at 7:46 AM Daniel Latypov <dlatypov@xxxxxxxxxx> wrote:
> > > >
> > > > On Mon, Oct 4, 2021 at 4:34 PM 'David Gow' via KUnit Development
> > > > <kunit-dev@xxxxxxxxxxxxxxxx> wrote:
> > > > >
> > > > > On Tue, Oct 5, 2021 at 1:01 AM Daniel Latypov <dlatypov@xxxxxxxxxx> wrote:
> > > > > >
> > > > > > Currently, `run_kernel()` dumps all the kernel output to a file
> > > > > > (.kunit/test.log) and then opens the file and yields it to callers.
> > > > > > This made it easier to respect the requested timeout, if any.
> > > > > >
> > > > > > But it means that we can't yield the results in real time, either to the
> > > > > > parser or to stdout (if --raw_output is set).
> > > > > >
> > > > > > This change spins up a background thread to enforce the timeout, which
> > > > > > allows us to yield the kernel output in real time, while also copying it
> > > > > > to the .kunit/test.log file.
> > > > > > It's also careful to ensure that the .kunit/test.log file is complete,
> > > > > > even in the kunit_parser throws an exception/otherwise doesn't consume
> > > > > > every line, see the new `finally` block and unit test.
> > > > > >
> > > > > > For example:
> > > > > >
> > > > > > $ ./tools/testing/kunit/kunit.py run --arch=x86_64 --raw_output
> > > > > > <configure + build steps>
> > > > > > ...
> > > > > > <can now see output from QEMU in real time>
> > > > > >
> > > > > > This does not currently have a visible effect when --raw_output is not
> > > > > > passed, as kunit_parser.py currently only outputs everything at the end.
> > > > > > But that could change, and this patch is a necessary step towards
> > > > > > showing parsed test results in real time.
> > > > > >
> > > > > > Signed-off-by: Daniel Latypov <dlatypov@xxxxxxxxxx>
> > > > > > ---
> > > > > > tools/testing/kunit/kunit_kernel.py | 73 +++++++++++++++-----------
> > > > > > tools/testing/kunit/kunit_tool_test.py | 17 ++++++
> > > > > > 2 files changed, 60 insertions(+), 30 deletions(-)
> > > > > >
> > > > > > diff --git a/tools/testing/kunit/kunit_kernel.py b/tools/testing/kunit/kunit_kernel.py
> > > > > > index 2c6f916ccbaf..b8cba8123aa3 100644
> > > > > > --- a/tools/testing/kunit/kunit_kernel.py
> > > > > > +++ b/tools/testing/kunit/kunit_kernel.py
> > > > > > @@ -12,7 +12,8 @@ import subprocess
> > > > > > import os
> > > > > > import shutil
> > > > > > import signal
> > > > > > -from typing import Iterator, Optional, Tuple
> > > > > > +import threading
> > > > > > +from typing import Iterator, List, Optional, Tuple
> > > > > >
> > > > > > from contextlib import ExitStack
> > > > > >
> > > > > > @@ -103,8 +104,8 @@ class LinuxSourceTreeOperations(object):
> > > > > > if stderr: # likely only due to build warnings
> > > > > > print(stderr.decode())
> > > > > >
> > > > > > - def run(self, params, timeout, build_dir, outfile) -> None:
> > > > > > - pass
> > > > > > + def start(self, params: List[str], build_dir: str) -> subprocess.Popen:
> > > > > > + raise RuntimeError('not implemented!')
> > > > > >
> > > > > >
> > > > > > class LinuxSourceTreeOperationsQemu(LinuxSourceTreeOperations):
> > > > > > @@ -123,7 +124,7 @@ class LinuxSourceTreeOperationsQemu(LinuxSourceTreeOperations):
> > > > > > kconfig.parse_from_string(self._kconfig)
> > > > > > base_kunitconfig.merge_in_entries(kconfig)
> > > > > >
> > > > > > - def run(self, params, timeout, build_dir, outfile):
> > > > > > + def start(self, params: List[str], build_dir: str) -> subprocess.Popen:
> > > > > > kernel_path = os.path.join(build_dir, self._kernel_path)
> > > > > > qemu_command = ['qemu-system-' + self._qemu_arch,
> > > > > > '-nodefaults',
> > > > > > @@ -134,18 +135,10 @@ class LinuxSourceTreeOperationsQemu(LinuxSourceTreeOperations):
> > > > > > '-nographic',
> > > > > > '-serial stdio'] + self._extra_qemu_params
> > > > > > print('Running tests with:\n$', ' '.join(qemu_command))
> > > > > > - with open(outfile, 'w') as output:
> > > > > > - process = subprocess.Popen(' '.join(qemu_command),
> > > > > > - stdin=subprocess.PIPE,
> > > > > > - stdout=output,
> > > > > > - stderr=subprocess.STDOUT,
> > > > > > - text=True, shell=True)
> > > > > > - try:
> > > > > > - process.wait(timeout=timeout)
> > > > > > - except Exception as e:
> > > > > > - print(e)
> > > > > > - process.terminate()
> > > > > > - return process
> > > > > > + return subprocess.Popen(' '.join(qemu_command),
> > > > > > + stdout=subprocess.PIPE,
> > > > > > + stderr=subprocess.STDOUT,
> > > > > > + text=True, shell=True)
> > > > > >
> > > > > > class LinuxSourceTreeOperationsUml(LinuxSourceTreeOperations):
> > > > > > """An abstraction over command line operations performed on a source tree."""
> > > > > > @@ -175,17 +168,13 @@ class LinuxSourceTreeOperationsUml(LinuxSourceTreeOperations):
> > > > > > kunit_parser.print_with_timestamp(
> > > > > > 'Starting Kernel with all configs takes a few minutes...')
> > > > > >
> > > > > > - def run(self, params, timeout, build_dir, outfile):
> > > > > > + def start(self, params: List[str], build_dir: str) -> subprocess.Popen:
> > > > > > """Runs the Linux UML binary. Must be named 'linux'."""
> > > > > > linux_bin = get_file_path(build_dir, 'linux')
> > > > > > - outfile = get_outfile_path(build_dir)
> > > > > > - with open(outfile, 'w') as output:
> > > > > > - process = subprocess.Popen([linux_bin] + params,
> > > > > > - stdin=subprocess.PIPE,
> > > > >
> > > > > This breaks --raw_output under UML for me. Including the
> > > > > stdin=subprocess.PIPE again seems to fix it.
> > > >
> > > > Can you give an example of what it does?
> > > >
> > > > I don't see any issues with --raw_output under UML with the patch as-is.
> > > > I was mainly testing this with UML, and I must have ran it some 10s of
> > > > times, so I'm a bit surprised.
> > > >
> > > > On an earlier version, I saw some mangling of --raw_output (\n was
> > > > missing), but that went away after some revisions.
> > > >
> > >
> > > Yeah, that's the sort of thing I'm seeing: \n being treated as just a
> > > new line (without the carriage return).
> > > It happens pretty consistently, though sometimes the text wraps and
> > > sometimes (well, once) everything gets forced into the last column of
> > > the terminal. I've not been able to get it to work at all without
> > > having stdin be subprocess.PIPE.
> >
> > I can't repro in the shell I was using while writing this.
> > Starting up a new session, I get:
> > * good
> > * bad x3
> > * good
> > * bad x3
> > * good
> > * bad x3
> > * bad, but not as bad (phew, I thought this was cursed)
> >
> > .kunit/test.log contains the output with proper \n each time I checked.
> >
> > Doesn't seem to repro when redirected into a file:
> >
> > $ ./tools/testing/kunit/kunit.py exec --raw_output > /tmp/out && diff
> > /tmp/out .kunit/test.log | grep -m1 '^>'
> >
> > Piping it through cat also seems 100% fine:
> > $ ./tools/testing/kunit/kunit.py exec --raw_output | cat
> >
> > So having stdin be subprocess.PIPE sorta makes sense in some twisted way.
> > I can also now repro that setting stdin=subprocess.PIPE seems to fix it.
> >
> > I don't want to put back subprocess.PIPE, but I'm not confident in
> > being able to find a better solution...
>
> So it turns out that there are a few issues with UML assuming stdin ==
> stdout (or at least that stdin and stdout are related), as seen in
> this ongoing thread on linux-um:
> http://lists.infradead.org/pipermail/linux-um/2021-October/001805.html

Ah, that sounds very related.

>
> Some of the coments in UML's chan_user.c seem pretty related, too, in
> that termios is used to set this weird
> "mostly-raw-but-\n-includes-carriage-returns" mode:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/um/drivers/chan_user.c#n90
>
> My theory is that UML sets "raw" mode based on what stdin is, then
> only sets OPOST if stdout is a tty, so there's a mismatch here. So,
> probably this is fundamentally a UML bug, but worth us working around
> in it kunit_tool for the time being. That being said, I've not been
> able to find anything obviously stdin-y setting raw mode thus far, so
> I could be wrong.
>
> My preference is that we put stdin=subprocess.PIPE back for now, since
> we don't want a regression, and if a cause in UML is found and fixed,
> we can remove it after the fix has propagated comfortably everywhere.
>

Agreed.
I sent out v2 here:
https://lore.kernel.org/linux-kselftest/20211005011340.2826268-1-dlatypov@xxxxxxxxxx

Based on this, we could probably drop it for the QEMU code path, but
I'll leave stdin untouched for both in the patch for consistency.

We can followup later and
* drop it from QEMU + add a comment/TODO on UML
* or just revist if/when we've found a fix and drop it from both.

> -- David
>
> >
> > >
> > > It occurs both under tmux and not, and under Konsole and xterm, so it
> > > doesn't appear to be specific to any given terminal implementation.
> > > Still occurs even after running 'reset', and with a clean build.
> > > QEMU-based --raw_output works fine.
> > >
> > >