Re: RFC: Link Time Optimization support for the kernel

From: Jan Hubicka
Date: Thu Aug 23 2012 - 11:13:03 EST


> > If data structures could be encapsulated/internalized to
> > subsystems and only global functions are exposed to other
> > subsystems [which are then LTO optimized] then our include
> > file dependencies could become a *lot* simpler.
>
> Yes, long term we could have these benefits.

Yes, in the long term LTO should make developers' lives easier; it is not just a
tool for getting a few extra % of performance.
There is a lot to do.
>
> BTW I should add LTO does more than just inlining:
> - Drop unused global functions and variables
> (so may cut down on ifdefs)
> - Detect type inconsistencies between files
> - Partial inlining (inline only parts of a function like a test
> at the beginning)
> - Detect pure and const functions without side effects that can be more
> aggressively optimized in the caller.
Also noreturn and nothrow are autodetected (the second is probably not a big deal
for the kernel, but it makes some C++ codebases a lot smaller by eliminating EH
and cleanups). We plan to add more in the near future.
> - Detect global clobbers globally. Normally any global call has to
> assume all global variables could be changed. With LTO information some
> of them can be cached in registers over calls.
> - Detect read only variables and optimize them
> - Optimize arguments to global functions (drop unnecessary arguments,
> optimize input/output etc.)

At the moment this really happens within compilation units only.
It is one of the harder optimizations to get working over the whole program;
we are slowly getting the infrastructure to make this possible.
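
As a rough illustration of the single-compilation-unit case (the function names
here are made up, and whether a given GCC actually drops the argument depends
on the version and options used): the static helper below never reads "flags",
and because every caller is visible in the same file the compiler is free to
remove that parameter.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not kernel code. 'flags' is never read, and the
 * function is static, so all of its callers are visible in this file.
 * That is the situation in which the unused argument can be dropped.
 */
static int sum_bytes(const unsigned char *buf, size_t len, int flags)
{
	int total = 0;

	(void)flags;	/* deliberately unused */
	for (size_t i = 0; i < len; i++)
		total += buf[i];
	return total;
}

/* The only caller; it always passes a dummy value for 'flags'. */
int checksum(const unsigned char *buf, size_t len)
{
	return sum_bytes(buf, len, 0) & 0xff;
}
```

If sum_bytes() were a global function called from other files, the same
transformation would need the whole-program view discussed above.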

> - Replace indirect calls with direct calls, enabling other
> optimizations.
> - Do constant propagation and specialization for functions. So if a
> function is called commonly with a constant it can generate a special
> variant of this function optimized for that. This still needs more tuning (and
> currently the code size impact is on the largish side), but I hope
> to eventually have e.g. a special kmalloc optimized for GFP_KERNEL.
> It can also in principle inline callbacks.
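
To make the specialization idea concrete, here is a hand-written sketch of what
a clone for one constant argument conceptually looks like. Everything in it is
hypothetical: my_alloc(), MY_ZERO and the clone name are invented for
illustration and are not kmalloc or a real GFP flag. The compiler would
generate such a clone itself; it is only written out by hand here.

```c
#include <stdlib.h>
#include <string.h>

#define MY_ZERO 0x1u	/* hypothetical flag, not a real GFP_* value */

/* General version: has to test 'flags' on every call. */
void *my_alloc(size_t size, unsigned int flags)
{
	void *p = malloc(size);

	if (p && (flags & MY_ZERO))
		memset(p, 0, size);
	return p;
}

/*
 * Roughly what a clone specialized for flags == MY_ZERO amounts to:
 * the flag test has been folded away and the parameter is gone.
 * Callers that pass the constant could be redirected to it.
 */
void *my_alloc_zero_clone(size_t size)
{
	void *p = malloc(size);

	if (p)
		memset(p, 0, size);
	return p;
}
```

The cost is the extra copy of the function body, which is the code size
concern mentioned in the quoted text.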

Also profile propagation is done. When a function is called only on cold paths, it becomes
cold.
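
A small sketch of the situation, using the GCC extensions
__attribute__((cold)) and __builtin_expect to mark the cold path by hand (the
function names are invented for the example): log_details() carries no
annotation of its own, but its only caller is cold, so it is a candidate for
being treated as cold too.

```c
#include <stdio.h>

int failure_count;	/* hypothetical counter, only for the example */

/* No annotation here: the only caller is report_failure() below. */
static void log_details(int code)
{
	failure_count++;
	fprintf(stderr, "failure code %d\n", code);
}

/* Explicitly marked as rarely executed (GCC/Clang extension). */
__attribute__((cold))
static void report_failure(int code)
{
	log_details(code);
}

int process(int value)
{
	/* Hint that the error branch is unlikely. */
	if (__builtin_expect(value < 0, 0)) {
		report_failure(value);
		return -1;
	}
	return value * 2;
}
```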

Thanks for all the hard work on the LTO kernel, Andi!
Honza
>
> -Andi
> --
> ak@xxxxxxxxxxxxxxx -- Speaking for myself only.