Re: [GIT PULL] pin control bulk changes for v4.16
From: Ingo Molnar
Date: Sat Feb 03 2018 - 05:46:19 EST
* Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> I also wonder if there are any automated tools that try to find these
> kinds of crazy things. I suspect a lot of our build time is the poor
> compiler just reading and parsing header files over and over again,
> and a lot of them are probably not needed.
Yes. I'd guesstimate that in a typical defconfig kernel build the compiler is
processing at least 10x as much code as it would need to with a cleaner header
file layout, based on preprocessed source code file sizes.
Interestingly the .i file linecount difference isn't all that large between
allmodconfig and allnoconfig kernels - which I think is further proof of our
'header spaghetti bloat' problems.
While central files like fork.c or sched/*.c are expected to have a lot of
dependencies, we also have a lot of bloat if we build much more isolated,
standalone core kernel functionality:

  # allmodconfig:
  triton:~/tip> wc -l kernel/task_work.i
  43522 kernel/task_work.i

  # allnoconfig:
  triton:~/tip> wc -l kernel/task_work.i
  37123 kernel/task_work.i

  # source code size:
  triton:~/tip> wc -l kernel/task_work.c
  118 kernel/task_work.c

We are bringing in 37 KLOC of headers to build a 0.1 KLOC .c file ...
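
For anyone who wants to repeat this measurement on other files, a minimal
sketch follows. It assumes the .i file was already generated with the kbuild
single-file target (e.g. 'make kernel/task_work.i'); expansion_ratio is a
made-up helper name for illustration, not an existing script.

```shell
# expansion_ratio (hypothetical helper): print how many preprocessed lines
# the compiler has to parse per line of original source, rounded down.
#   $1 = the .c file, $2 = the matching preprocessed .i file
expansion_ratio() {
    # $(( ... )) strips any padding some wc implementations add
    src_lines=$(( $(wc -l < "$1") ))
    pre_lines=$(( $(wc -l < "$2") ))

    # Avoid dividing by zero on an empty or missing source file
    if [ "$src_lines" -eq 0 ]; then
        echo "n/a"
        return 1
    fi

    echo $(( pre_lines / src_lines ))
}
```

With the allnoconfig numbers above, 'expansion_ratio kernel/task_work.c
kernel/task_work.i' would print 314 (37123 / 118, rounded down).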
> A year ago, Ingo did patches to limit some of the header file issues for
> the core headers (<linux/sched.h> in particular). Maybe he had
> tooling? Ingo?
No, unfortunately I didn't use much tooling: I only used simple manual tools like
'grep' and small ad-hoc shell scripts to discover some of the deeper dependencies
(long lost - nor was there any real value in them). What I relied on mostly was
randconfig build coverage.
In that sched.h split-up effort a year ago I literally removed/moved the
prototypes and header files one by one and looked at what broke. If the
breakage was too widespread I fell back to grep to find the affected users.
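
The remove-one-and-rebuild loop can be roughly sketched in shell. This is not
the long-lost script, only an illustration of the approach: try_drop_includes
is a hypothetical name, and the build command is whatever the caller passes in.

```shell
# try_drop_includes (hypothetical helper): for each #include line in a file,
# rebuild with only that line removed and print the includes whose removal
# still builds. The original file is restored at the end.
#   $1 = source file, remaining args = build command to run
try_drop_includes() {
    src=$1; shift
    backup=$(mktemp) || return 1
    cp "$src" "$backup"

    # Line numbers of all #include directives in the pristine copy
    grep -n '^[[:space:]]*#[[:space:]]*include' "$backup" | cut -d: -f1 |
    while read -r n; do
        # Write a variant of the file with include line $n deleted
        sed "${n}d" "$backup" > "$src"
        # Keep the build from eating the loop's stdin
        if "$@" </dev/null >/dev/null 2>&1; then
            sed -n "${n}p" "$backup"
        fi
    done

    cp "$backup" "$src"
    rm -f "$backup"
}
```

Usage would be something like 'try_drop_includes kernel/task_work.c make
kernel/task_work.o'. A clean build under one .config only makes an include a
candidate for removal: another config may still need it, which is why
randconfig build coverage matters.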
But based on the sched.h experiment I do think our kernel build times could be
significantly improved by organizing the headers better. Splitting up sched.h also
improved readability and maintainability, so it was a win-win all around.
With a more aggressive reorganization of our header architecture I believe we
could achieve a more than 5x improvement in kernel build times (!) - but that
would involve some trade-offs for header maintainability: a finer grained
hierarchy is somewhat harder to maintain.
With extreme measures that would involve runtime performance trade-offs as well
(to get rid of excessive inlining cross-dependencies) we could possibly achieve a
30x improvement in kernel compilation times: at that point the build would be
limited mainly by link time and build parallelism on most systems.
Thanks,
Ingo