Re: [PATCH v4 04/11] x86/bhi: Make clear_bhb_loop() effective on newer CPUs

From: Dave Hansen

Date: Fri Nov 21 2025 - 11:40:45 EST


On 11/19/25 22:18, Pawan Gupta wrote:
> - CLEAR_BHB_LOOP_SEQ 5, 5
> + /* loop count differs based on CPU-gen, see Intel's BHI guidance */
> + ALTERNATIVE (CLEAR_BHB_LOOP_SEQ 5, 5), \
> + __stringify(CLEAR_BHB_LOOP_SEQ 12, 7), X86_FEATURE_BHI_CTRL

There are a million ways to skin this cat. But I'm not sure I really
like the end result here. It seems a little overkill to use ALTERNATIVE
to rewrite a whole sequence just to patch two constants in there.

What if CLEAR_BHB_LOOP_SEQ just took its inner and outer loop counts
as register arguments? Then this would look more like:

	ALTERNATIVE "mov $5, %rdi; mov $5, %rsi",
		    "mov $12, %rdi; mov $7, %rsi",
		    ...

	CLEAR_BHB_LOOP_SEQ

Or, even global variables:

	mov outer_loop_count(%rip), %rdi
	mov inner_loop_count(%rip), %rsi

and then have some C code somewhere that does:

	/* Counts match the quoted hunk: 12/7 with BHI_CTRL, 5/5 otherwise */
	if (cpu_feature_enabled(X86_FEATURE_BHI_CTRL)) {
		outer_loop_count = 12;
		inner_loop_count = 7;
	} else {
		outer_loop_count = 5;
		inner_loop_count = 5;
	}

... and I'm sure I got something wrong in there like flipping the
inner/outer counts, and I'm not even thinking about the variable types.
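
To make the global-variable idea concrete, here is a minimal user-space
sketch in plain C. It is not kernel code and not a proposed
implementation. The struct, the function names and the has_bhi_ctrl
parameter are all hypothetical stand-ins (has_bhi_ctrl models
cpu_feature_enabled(X86_FEATURE_BHI_CTRL)). The 12/7 and 5/5 values are
taken from the quoted hunk, and the nested loop only models the iteration
counts, not the real call/jmp sequence:

```c
#include <stdbool.h>

/* Hypothetical container for the two counts; not a name from the patch. */
struct bhb_loop_counts {
	unsigned int outer;
	unsigned int inner;
};

/* Defaults correspond to the pre-existing "5, 5" sequence. */
struct bhb_loop_counts bhb_counts = { .outer = 5, .inner = 5 };

/*
 * Select the counts once, in C. 'has_bhi_ctrl' is an assumption that
 * stands in for cpu_feature_enabled(X86_FEATURE_BHI_CTRL).
 */
void bhb_select_loop_counts(bool has_bhi_ctrl)
{
	if (has_bhi_ctrl) {
		bhb_counts.outer = 12;
		bhb_counts.inner = 7;
	} else {
		bhb_counts.outer = 5;
		bhb_counts.inner = 5;
	}
}

/*
 * Model of a loop that takes its counts as arguments instead of
 * hard-coding them. Returns the total number of inner iterations.
 */
unsigned int bhb_loop_model(unsigned int outer, unsigned int inner)
{
	unsigned int iterations = 0;

	for (unsigned int i = 0; i < outer; i++)
		for (unsigned int j = 0; j < inner; j++)
			iterations++;

	return iterations;
}
```

The point of the shape is that the assembly side would only load two
values, and the decision about which values to use lives in one C function.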

But, basically, I think I want to avoid as much logic as possible in
assembly. I also think we should reserve ALTERNATIVE for things that
truly need it, like code that is performance sensitive or that can't
reach out and poke at variables.

Peter Z. usually has good instincts on these things, so I'm curious what
he thinks of all this.