Re: [PATCH v2] docs: drop the version constraints for sphinx and dependencies

From: Donald Hunter
Date: Mon Mar 18 2024 - 12:45:14 EST


Lukas Bulwahn <lukas.bulwahn@xxxxxxxxx> writes:

> As discussed (see Links), there is some inertia to move to the recent
> Sphinx versions for the doc build environment.
>
> [...]
>
> Link: https://lore.kernel.org/linux-doc/874jf4m384.fsf@xxxxxxxxxxxx/
> Link: https://lore.kernel.org/linux-doc/20240226093854.47830-1-lukas.bulwahn@xxxxxxxxx/
> Reviewed-by: Akira Yokosawa <akiyks@xxxxxxxxx>
> Tested-by: Vegard Nossum <vegard.nossum@xxxxxxxxxx>
> Signed-off-by: Lukas Bulwahn <lukas.bulwahn@xxxxxxxxx>
> ---
> v1 -> v2:
> drop jinja2 as suggested by Vegard.
> add tags from v1 review
>
> Documentation/doc-guide/sphinx.rst | 11 ++++++-----
> Documentation/sphinx/requirements.txt | 7 ++-----
> scripts/sphinx-pre-install | 19 +++----------------
> 3 files changed, 11 insertions(+), 26 deletions(-)

Apologies if I am a little late to the party here - I am just catching
up with the changes on docs-next.

I went to install Sphinx 2.4.4 using requirements.txt for some doc work
and hit the upstream Sphinx dependency breakage. So I pulled docs-next
with the intention of sending a patch to requirements.txt with pinned
dependences. When I noticed that things have already moved on in
docs-next, I decided to spend some time investigating the performance
regression that has been present in Sphinx from 3.0.0 until now.

With Sphinx 2.4.4 I always get timings in this ballpark:

% time make htmldocs
..
real 4m5.417s
user 17m0.379s
sys 1m11.889s

With Sphinx 7.2.6 it's typically over 9 minutes:

% time make htmldocs
..
real 9m0.533s
user 15m38.397s
sys 1m0.907s

I collected profiling data using cProfile:

export srctree=`pwd`
export BUILDDIR=`pwd`/Documentation/output
python3 -m cProfile -o profile.dat ./sphinx_latest/bin/sphinx-build \
-b html \
-c ./Documentation \
-d ./Documentation/output/.doctrees \
-D version=6.8.0 -D release= \
-D kerneldoc_srctree=. -D kerneldoc_bin=./scripts/kernel-doc \
./Documentation \
./Documentation/output

Here's some of the profiling output:

$ python3 -m pstats profile.dat
Welcome to the profile statistics browser.
profile.dat% sort tottime
profile.dat% stats 10
Fri Mar 15 17:09:39 2024 profile.dat

3960680702 function calls (3696376639 primitive calls) in 1394.384 seconds

Ordered by: internal time
List reduced from 6733 to 10 due to restriction <10>

ncalls tottime percall cumtime percall filename:lineno(function)
770364892 165.102 0.000 165.102 0.000 sphinx/domains/c.py:153(__eq__)
104124 163.968 0.002 544.788 0.005 sphinx/domains/c.py:1731(_find_named_symbols)
543888397 123.767 0.000 176.685 0.000 sphinx/domains/c.py:1679(children_recurse_anon)
4292 74.081 0.017 74.081 0.017 {method 'poll' of 'select.poll' objects}
631233096 69.389 0.000 246.017 0.000 sphinx/domains/c.py:1746(candidates)
121406721/3359598 65.689 0.000 76.762 0.000 docutils/nodes.py:202(_fast_findall)
3477076 64.387 0.000 65.758 0.000 sphinx/util/nodes.py:633(_copy_except__document)
544032973 52.950 0.000 52.950 0.000 sphinx/domains/c.py:156(is_anon)
79012597/3430 36.395 0.000 36.395 0.011 sphinx/domains/c.py:1656(clear_doc)
286882978 31.271 0.000 31.279 0.000 {built-in method builtins.isinstance}

profile.dat% callers c.py:153
Ordered by: internal time
List reduced from 6733 to 4 due to restriction <'c.py:153'>

Function was called by...
ncalls tottime cumtime
sphinx/domains/c.py:153(__eq__) <- 631153346 134.803 134.803 sphinx/domains/c.py:1731(_find_named_symbols)
154878 0.041 0.041 sphinx/domains/c.py:2085(find_identifier)
139056533 30.259 30.259 sphinx/domains/c.py:2116(direct_lookup)
135 0.000 0.000 sphinx/util/cfamily.py:89(__eq__)

>From that you can see there is a significant call amplification from
_find_named_symbols (100k calls) to __eq__ (630 million calls), plus
several other expensive functions. Looking at the code [1], you can see
why. It's doing a list walk to find matching symbols. When adding new
symbols it does an exhaustive walk to check for duplicates, so you get
worst-case performance, with ~13k symbols in a list during the doc
build.

I have an experimental fix that uses a dict for lookups. With the fix, I
consistently get times in the sub 5 minute range:

% time make htmldocs
..
real 4m27.085s
user 10m56.985s
sys 0m56.385s

I expect there are other speedups to be found. I will clean up my Sphinx
changes and share them on a GitHub branch (as well as push them
upstream) so that others can try them out.

For some reason, if I run sphinx-build manually with -j 12 (I have a 12
core machine) I get better performance than make htmldocs:

% sphinx-build -j 12 ...
..
real 3m56.074s
user 9m52.775s
sys 0m52.905s

I haven't had a chance to look at what makes the difference here, but
will investigate when I have time.

Cheers,
Donald.

[1] https://github.com/sphinx-doc/sphinx/blob/ff252861a7b295e8dd8085ea9f6ed85e085273fc/sphinx/domains/c/_symbol.py#L235-L283

> diff --git a/Documentation/sphinx/requirements.txt b/Documentation/sphinx/requirements.txt
> index 5d47ed443949..5017f307c8a4 100644
> --- a/Documentation/sphinx/requirements.txt
> +++ b/Documentation/sphinx/requirements.txt
> @@ -1,6 +1,3 @@
> -# jinja2>=3.1 is not compatible with Sphinx<4.0
> -jinja2<3.1
> -# alabaster>=0.7.14 is not compatible with Sphinx<=3.3
> -alabaster<0.7.14
> -Sphinx==2.4.4
> +alabaster
> +Sphinx
> pyyaml