Re: [PATCH v2 0/4] tracing: Have filters and histograms use a call table instead of pointers

From: Google
Date: Fri Sep 09 2022 - 08:05:32 EST


On Tue, 06 Sep 2022 18:53:13 -0400
Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:

> While looking at the histogram and filter code, I realized that it's filled
> with function pointers. With retpolines causing a big slowdown, I thought
> that was problematic. Thus, I decided to see what would happen if I changed
> the function pointers into enums, and instead called a single function
> that did a switch on those enums and called the necessary functions
> directly. The results were pretty clear.
>
> The first patch was to update the trace event benchmark event to include
> a integer value "delta" of the delta that it took to complete
> (it currently only shows the delta as part of a string). By doing
> so, I could benchmark the histogram and filter logic with it.
>
> Before this series, the histogram with a single filter (to ignore the
> first event, which has a delta of zero), had:
>
> # event histogram
> #
> # trigger info: hist:keys=delta:vals=hitcount:sort=delta:size=2048 if delta > 0 [active]
> #
>
> { delta: 129 } hitcount: 2213
> { delta: 130 } hitcount: 285965
> { delta: 131 } hitcount: 1146545
> { delta: 132 } hitcount: 5185432
> { delta: 133 } hitcount: 19896215
> { delta: 134 } hitcount: 53118616
> { delta: 135 } hitcount: 83816709
> { delta: 136 } hitcount: 68329562
> { delta: 137 } hitcount: 41859349
> { delta: 138 } hitcount: 46257797
> { delta: 139 } hitcount: 54400831
> { delta: 140 } hitcount: 72875007
> { delta: 141 } hitcount: 76193272
> { delta: 142 } hitcount: 49504263
> { delta: 143 } hitcount: 38821072
> { delta: 144 } hitcount: 47702679
> { delta: 145 } hitcount: 41357297
> { delta: 146 } hitcount: 22058238
> { delta: 147 } hitcount: 9720002
> { delta: 148 } hitcount: 3193542
> { delta: 149 } hitcount: 927030
> { delta: 150 } hitcount: 850772
> { delta: 151 } hitcount: 1477380
> { delta: 152 } hitcount: 2687977
> { delta: 153 } hitcount: 2865985
> { delta: 154 } hitcount: 1977492
> { delta: 155 } hitcount: 2475607
> { delta: 156 } hitcount: 3403612
> { delta: 157 } hitcount: 2264011
> { delta: 158 } hitcount: 1096214
> { delta: 159 } hitcount: 504653
> { delta: 160 } hitcount: 218869
> { delta: 161 } hitcount: 103246
> [..]
>
> Where the bulk was around 142ns, and the fastest time was 129ns.
>
> After this series:
>
> # event histogram
> #
> # trigger info: hist:keys=delta:vals=hitcount:sort=delta:size=2048 if delta > 0 [active]
> #
>
> { delta: 103 } hitcount: 60
> { delta: 104 } hitcount: 16966
> { delta: 105 } hitcount: 396625
> { delta: 106 } hitcount: 3223400
> { delta: 107 } hitcount: 12053754
> { delta: 108 } hitcount: 20241711
> { delta: 109 } hitcount: 14850200
> { delta: 110 } hitcount: 4946599
> { delta: 111 } hitcount: 3479315
> { delta: 112 } hitcount: 18698299
> { delta: 113 } hitcount: 62388733
> { delta: 114 } hitcount: 95803834
> { delta: 115 } hitcount: 58278130
> { delta: 116 } hitcount: 15364800
> { delta: 117 } hitcount: 5586866
> { delta: 118 } hitcount: 2346880
> { delta: 119 } hitcount: 1131091
> { delta: 120 } hitcount: 620896
> { delta: 121 } hitcount: 236652
> { delta: 122 } hitcount: 105957
> { delta: 123 } hitcount: 119107
> { delta: 124 } hitcount: 54494
> { delta: 125 } hitcount: 63856
> { delta: 126 } hitcount: 64454
> { delta: 127 } hitcount: 34818
> { delta: 128 } hitcount: 41446
> { delta: 129 } hitcount: 51242
> { delta: 130 } hitcount: 28361
> { delta: 131 } hitcount: 23926
> { delta: 132 } hitcount: 22253
> { delta: 133 } hitcount: 16994
> { delta: 134 } hitcount: 14970
> { delta: 135 } hitcount: 13464
> { delta: 136 } hitcount: 11452
> { delta: 137 } hitcount: 12212
> { delta: 138 } hitcount: 12280
> { delta: 139 } hitcount: 9127
> { delta: 140 } hitcount: 9553
>
> Where the bulk was around 114ns and the fast time was 103ns.
>
> That's almost a 20% speedup!!!
>
> Changes since v1: https://lore.kernel.org/all/20220823214606.344269352@xxxxxxxxxxx/
>
> - Fixed combining constants (Masami Hiramatsu)

This looks good to me and confirmed that the previous issue is solved.

Reviewed-by: Masami Hiramatsu (Google) <mhiramat@xxxxxxxxxx>
Tested-by: Masami Hiramatsu (Google) <mhiramat@xxxxxxxxxx>

Thank you!

>
> Steven Rostedt (Google) (4):
> tracing: Add numeric delta time to the trace event benchmark
> tracing/hist: Call hist functions directly via a switch statement
> tracing: Move struct filter_pred into trace_events_filter.c
> tracing/filter: Call filter predicate functions directly via a switch statement
>
> ----
> kernel/trace/trace.h | 13 --
> kernel/trace/trace_benchmark.c | 2 +-
> kernel/trace/trace_benchmark.h | 8 +-
> kernel/trace/trace_events_filter.c | 239 ++++++++++++++++++++++++-----------
> kernel/trace/trace_events_hist.c | 246 +++++++++++++++++++++++++------------
> 5 files changed, 343 insertions(+), 165 deletions(-)


--
Masami Hiramatsu (Google) <mhiramat@xxxxxxxxxx>