Re: [PATCH v2] docs: Use make invocation's -j argument for parallelism
From: Jonathan Corbet
Date: Sun Sep 22 2019 - 16:03:53 EST
On Thu, 19 Sep 2019 14:44:37 -0700
Kees Cook <keescook@xxxxxxxxxxxx> wrote:
> While sphinx 1.7 and later supports "-jauto" for parallelism, this
> effectively ignores the "-j" flag used in the "make" invocation, which
> may cause confusion for build systems. Instead, extract the available
What sort of confusion might we expect? Or, to channel akpm, "what are the
user-visible effects of this bug"?
> parallelism from "make"'s job server (since it is not exposed in any
> special variables) and use that for the "sphinx-build" run. Now things
> work correctly for builds where -j is specified at the top-level:
>
> make -j16 htmldocs
>
> If -j is not specified, continue to fallback to "-jauto" if available.
So this seems like a good thing to do. I do have a couple of small issues,
though...
[...]
> + -j $(shell python3 $(srctree)/scripts/jobserver-count $(SPHINX_PARALLEL)) \
This (and the shebang line in the script itself) will cause the docs build
to fail on systems lacking Python 3. While we have talked about requiring
Python 3 for the docs build, we have not actually taken that step yet. We
probably shouldn't sneak it in here. I don't see anything in the script
that should require a specific Python version, so I think it should be
tweaked to be version-independent and just invoke "python".
> -b $2 \
> -c $(abspath $(srctree)/$(src)) \
> -d $(abspath $(BUILDDIR)/.doctrees/$3) \
> diff --git a/scripts/jobserver-count b/scripts/jobserver-count
> new file mode 100755
> index 000000000000..ff6ebe6b0194
> --- /dev/null
> +++ b/scripts/jobserver-count
> @@ -0,0 +1,53 @@
> +#!/usr/bin/env python3
> +# SPDX-License-Identifier: GPL-2.0-or-later
By license-rules.rst, this should be GPL-2.0+
> +#
> +# This determines how many parallel tasks "make" is expecting, as it is
> +# not exposed via an special variables.
> +# https://www.gnu.org/software/make/manual/html_node/POSIX-Jobserver.html#POSIX-Jobserver
> +import os, sys, fcntl
> +
> +# Default parallelism is "1" unless overridden on the command-line.
> +default="1"
> +if len(sys.argv) > 1:
> + default=sys.argv[1]
> +
> +# Set non-blocking for a given file descriptor.
> +def nonblock(fd):
> + flags = fcntl.fcntl(fd, fcntl.F_GETFL)
> + fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)
> + return fd
> +
> +# Extract and prepare jobserver file descriptors from envirnoment.
> +try:
> + # Fetch the make environment options.
> + flags = os.environ['MAKEFLAGS']
> +
> + # Look for "--jobserver=R,W"
> + opts = [x for x in flags.split(" ") if x.startswith("--jobserver")]
> +
> + # Parse out R,W file descriptor numbers and set them nonblocking.
> + fds = opts[0].split("=", 1)[1]
> + reader, writer = [nonblock(int(x)) for x in fds.split(",", 1)]
> +except:
So I have come to really dislike bare "except" clauses; I've seen them hide
too many bugs. In this case, perhaps it's justified, but still ... it bugs
me ...
> + # Any failures here should result in just using the default
> + # specified parallelism.
> + print(default)
> + sys.exit(0)
> +
> +# Read out as many jobserver slots as possible.
> +jobs = b""
> +while True:
> + try:
> + slot = os.read(reader, 1)
> + jobs += slot
> + except:
This one, I think, should be explicit; anything other than EWOULDBLOCK
indicates a real problem, right?
> + break
> +# Return all the reserved slots.
> +os.write(writer, jobs)
You made writer nonblocking, so it seems plausible that we could leak some
slots here, no? Does writer really need to be nonblocking?
> +# If the jobserver was (impossibly) full or communication failed, use default.
> +if len(jobs) < 1:
> + print(default)
> +
> +# Report available slots (with a bump for our caller's reserveration).
> +print(len(jobs) + 1)
The last question I have is...why is it that we have to do this complex
dance rather than just passing the "-j" option through directly to sphinx?
That comes down to the "confusion" mentioned at the top, I assume. It
would be good to understand that?
Thanks,
jon