Re: [PATCH v3] bpf/verifier: optimize precision backtracking by skipping precise bits

From: Eduard Zingerman

Date: Mon Jan 19 2026 - 13:43:40 EST


On Sat, 2026-01-17 at 18:09 +0800, Qiliang Yuan wrote:

Hi Qiliang,

> 2. System-wide saturation profiling (32 cores):
> # Start perf in background
> sudo perf stat -a -- sleep 60 &
> # Start 32 parallel loops of veristat
> for i in {1..32}; do (while true; do ./veristat backtrack_stress.bpf.o > /dev/null; done &); done

I'm not sure system-wide testing is helpful in this context.
I'd suggest collecting stats for a single process, e.g. as follows:

perf stat -B --all-kernel -r10 -- ./veristat -q pyperf180.bpf.o

(Note: pyperf180 is a reasonably complex test for many purposes).
And then collecting profiling data:

perf record -o <somewhere-where-mmap-is-possible> \
--all-kernel --call-graph=dwarf --vmlinux=<path-to-vmlinux> \
-- ./veristat -q pyperf180.bpf.o

And then inspecting the profiling data using `perf report`.
What I see in the stats corroborates Yonghong's findings:

W/o the patch:
...
22293282 branch-misses # 2.8 % branch_miss_rate ( +- 1.25% ) (50.10%)
594485451 branches # 1012.5 M/sec branch_frequency ( +- 1.68% ) (66.67%)
1544503960 cpu-cycles # 2.6 GHz cycles_frequency ( +- 0.18% ) (67.02%)
3305212994 instructions # 2.1 instructions insn_per_cycle ( +- 2.04% ) (67.11%)
587496908 stalled-cycles-frontend # 0.38 frontend_cycles_idle ( +- 1.21% ) (66.39%)

0.60033 +- 0.00173 seconds time elapsed ( +- 0.29% )

With the patch:
...
22397789 branch-misses # 2.8 % branch_miss_rate ( +- 1.27% ) (50.37%)
596289399 branches # 1004.8 M/sec branch_frequency ( +- 1.59% ) (66.95%)
1546060617 cpu-cycles # 2.6 GHz cycles_frequency ( +- 0.16% ) (66.67%)
3325745352 instructions # 2.2 instructions insn_per_cycle ( +- 1.76% ) (66.61%)
588040713 stalled-cycles-frontend # 0.38 frontend_cycles_idle ( +- 1.23% ) (66.48%)

0.60697 +- 0.00201 seconds time elapsed ( +- 0.33% )
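
For a quick side-by-side of the two runs above, here is a minimal Python sketch. The numbers are copied from the perf output quoted in this mail. The helper names (rel_change, sigma_distance) are made up for illustration, and combining the two "+-" values in quadrature assumes the runs are independent, so treat the result as a rough sanity check rather than a rigorous test.

```python
import math

# Figures copied from the two `perf stat -r10` runs quoted above.
# elapsed_s is (mean, "+-" spread as printed by perf).
baseline = {
    "elapsed_s": (0.60033, 0.00173),
    "instructions": 3305212994,
    "cpu_cycles": 1544503960,
}
patched = {
    "elapsed_s": (0.60697, 0.00201),
    "instructions": 3325745352,
    "cpu_cycles": 1546060617,
}


def rel_change(before, after):
    """Relative change from `before` to `after`; positive means `after` is larger."""
    return (after - before) / before


def sigma_distance(before, after):
    """Difference of two (mean, spread) pairs in units of their combined spread.

    Assumption (not from the mail): the spreads are independent and can be
    combined in quadrature.
    """
    (mean0, spread0), (mean1, spread1) = before, after
    return (mean1 - mean0) / math.hypot(spread0, spread1)


if __name__ == "__main__":
    for key in ("instructions", "cpu_cycles"):
        print(f"{key}: {rel_change(baseline[key], patched[key]):+.2%}")
    t0, t1 = baseline["elapsed_s"], patched["elapsed_s"]
    print(f"elapsed: {rel_change(t0[0], t1[0]):+.2%}, "
          f"{sigma_distance(t0, t1):+.1f} combined spreads")
```

A positive value means the patched run was larger or slower. On these numbers every metric moves slightly upward with the patch, so there is no sign of a speedup.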

So the patch shows no measurable improvement here, and I'd suggest
shelving this change for now.

If you take a look at the profiling data, you'll notice that the
low-hanging fruit is actually improving bpf_patch_insn_data():
it takes ~40% of the time, at least for this program.
This was actually discussed a very long time ago [1].
If you are interested in speeding up the verifier,
maybe consider taking a look?

Best regards,
Eduard Zingerman.

[1] https://lore.kernel.org/bpf/CAEf4BzY_E8MSL4mD0UPuuiDcbJhh9e2xQo2=5w+ppRWWiYSGvQ@xxxxxxxxxxxxxx/