Re: [PATCH 2/3] perf tools: Spare double comparison of callchain first entry

From: Namhyung Kim
Date: Wed Jan 15 2014 - 01:24:16 EST


On Tue, 14 Jan 2014 16:37:15 +0100, Frederic Weisbecker wrote:
> When a new callchain child branch matches an existing one in the rbtree,
> the comparison of its first entry is performed twice:
>
> 1) From append_chain_children() on branch lookup
>
> 2) If 1) reports a match, append_chain() then compares all entries of
> the new branch against the matching node in the rbtree, and this
> comparison includes the first entry of the new branch again.

Right.

>
> Lets shortcut this by performing the whole comparison only from
> append_chain() which then returns the result of the comparison between
> the first entry of the new branch and the iterating node in the rbtree.
> If the first entry matches, the lookup on the current level of siblings
> stops and propagates to the children of the matching nodes.

Hmm.. it looks like that I thought directly calling append_chain() has
some overhead - but it's not.

>
> This results in less comparisons performed by the CPU.

Do you have any numbers? I suspect it'd not be a big change, but just
curious.

>
> Signed-off-by: Frederic Weisbecker <fweisbec@xxxxxxxxx>

Reviewed-by: Namhyung Kim <namhyung@xxxxxxxxxx>

Thanks,
Namhyung
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/