Re: [PATCH] perf symbols: fix slowness due to -ffunction-section

From: Arnaldo Carvalho de Melo
Date: Wed Nov 21 2018 - 16:27:06 EST


On Wed, Nov 21, 2018 at 09:51:19AM -0800, Eric Saint-Etienne wrote:
> Perf can take minutes to parse an image when -ffunction-sections is used.
> This is especially true of the kernel image when it is compiled this way,
> which is the arm64 default since the patchset "Enable deadcode elimination
> at link time".
>
> Perf organizes maps using an rbtree. Whenever perf finds a new symbol, it
> first searches this rbtree for the map it belongs to, by strcmp()'ing
> section names. When it finds the map with the right name, it uses it to
> add the symbol. A typical image has only a few maps, but with
> -ffunction-sections there is basically one map per function.
> With the kernel image that's north of 40,000 maps. For most symbols perf
> has to walk the entire rbtree before eventually creating a new map and
> adding it. Consequently perf spends most of its time walking an rbtree
> that keeps getting larger.
>
> This performance fix introduces a secondary rbtree that indexes maps based
> on the section name.
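
For readers without the diff at hand, here is a rough sketch of what "a secondary tree indexed by section name" can look like. It is not the patch itself: struct demo_map, demo_find() and demo_find_or_create() are hypothetical names, and POSIX tsearch()/tfind() stand in for the kernel-style rbtree that perf actually uses. The point is only that the name lookup becomes an ordered search with O(log n) string comparisons (assuming a balanced tree, which POSIX does not mandate) instead of a walk over every map.

```c
#include <search.h>   /* POSIX tsearch()/tfind() */
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for perf's struct map: only the section name matters here. */
struct demo_map {
	const char *section;    /* e.g. ".text.do_fork" */
	unsigned long nr_syms;  /* placeholder payload */
};

/* Root of the secondary index, ordered by section name. */
static void *by_name_root;

/* Order two maps by section name. */
static int cmp_section(const void *a, const void *b)
{
	const struct demo_map *ma = a, *mb = b;

	return strcmp(ma->section, mb->section);
}

/* Ordered lookup by name; returns NULL when no map exists for this section. */
static struct demo_map *demo_find(const char *section)
{
	struct demo_map key = { .section = section };
	void *node = tfind(&key, &by_name_root, cmp_section);

	/* tfind() returns a pointer to the stored element pointer. */
	return node ? *(struct demo_map **)node : NULL;
}

/* Find the map for a section, creating and indexing it on first use. */
static struct demo_map *demo_find_or_create(const char *section)
{
	struct demo_map *map = demo_find(section);

	if (map)
		return map;

	map = calloc(1, sizeof(*map));
	if (!map)
		return NULL;

	map->section = section;  /* caller must keep the string alive */
	if (!tsearch(map, &by_name_root, cmp_section)) {
		free(map);       /* tsearch() failed to allocate a node */
		return NULL;
	}
	return map;
}
```

Calling demo_find_or_create() once per symbol with its section name returns the same map for repeated names and creates a new one only on first sight, without visiting the other maps.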
>
> Signed-off-by: Eric Saint-Etienne <eric.saint.etienne@xxxxxxxxxx>
> Reviewed-by: Dave Kleikamp <dave.kleikamp@xxxxxxxxxx>
> Reviewed-by: David Aldridge <david.aldridge@xxxxxxxxxx>
> Reviewed-by: Rob Gardner <rob.gardner@xxxxxxxxxx>

Looks sane, thanks to the multiple reviewers, really appreciated,

Applied.

- Arnaldo