Re: [PATCH v4 00/28] objtool: Function validation tracing

From: Alexandre Chartre

Date: Fri Nov 14 2025 - 04:57:14 EST



On 11/14/25 02:48, Josh Poimboeuf wrote:
On Thu, Nov 13, 2025 at 05:48:49PM +0100, Alexandre Chartre wrote:
Changes:
========

V4:
---
This version fixes a build issue when disassembly is not available. Compared
with V3, this is addresses by changes in patch 14 (objtool: Improve tracing
of alternative instructions). Other patches are similar to V3.

For the next revision, please base on tip/master, as there are some
major objtool changes pending for the next merge window.

Ok, I will rebase the next revision on tip/master.


Most of my comments below are bikeshedding, they are not required for
the next revision and can be addressed in followup patch sets if you'd
rather do it that way.

If changes are simple, I will try to address them immediately otherwise
defer to next patches.


- Each alternative of a group alternative is displayed with its feature
name and flags: <flags><feature-name>

<flags> is made of the following characters:

'!' : ALT_FLAG_NOT
'+' : ALT_FLAG_DIRECT_CALL
'?' : unknown flag (i.e. any other flags)

Other than '!', the meaning of the flags isn't intuitive. Maybe it
should just show the source code names:

ALT_NOT(X86_FEATURE_FOO)

ALT_DIRECT_CALL(X86_FEATURE_BAR)

ALT_UNKNOWN_FLAG(X86_FEATURE_BAZ)


I think '?' is meaningful too, but I wasn't sure about '+'.

I am using single characters to keep the alternative name short. It can already
be fairly long because of the feature name (like "X86_FEATURE_SPEC_STORE_BYPASS_DISABLE")

Also I am assuming that flags can be combined (although that's not currently
the case) so that would be more difficult with full ALT_* names and the
result would be much longer.


- If an alternative is a jump table then "JUMP" is used as the feature
name.

Hm, it's a bit confusing to label a jump label as an "alternative" as
those are two distinct things (though I'm aware that objtool conflates
the two).

- If an alternative is an exception table then "EXCEPTION" is used as the
feature name.

Ditto.


Yes, the wording is not good, I use it just because objtool handles jump
labels and exception tables as alternative. I will reword to something
better.


Disassembly can show default alternative jumping to .altinstr_aux
-----------------------------------------------------------------
Disassembly can show a default alternative jumping to .altinstr_aux. This
happens when the _static_cpu_has() function is used. Its default code
jumps to .altinstr_aux where a test sequence is executed (test; jnz; jmp).

At runtime, this sequence is not used because the _static_cpu_has()
an alternative with the X86_FEATURE_ALWAYS feature.


debc: perf_get_x86_pmu_capability+0xc jmpq 0xdec1 <.altinstr_aux+0xfc> | NOP5 (X86_FEATURE_HYBRID_CPU) | jmpq 0x61a <perf_get_x86_pmu_capability+0x37> (X86_FEATURE_ALWAYS) # <alternative.debc>

I'm finding this one-line format considerably more difficult to parse
than the slightly longer two-line form:

debc: perf_get_x86_pmu_capability+0xc <alternative.debc> | X86_FEATURE_HYBRID_CPU | X86_FEATURE_ALWAYS
debc: perf_get_x86_pmu_capability+0xc jmpq 0xdec1 <.altinstr_aux+0xfc> | NOP5 | jmpq 0x61a <perf_get_x86_pmu_capability+0x37>


Another option could be:

debc: perf_get_x86_pmu_capability+0xc jmpq 0xdec1 <.altinstr_aux+0xfc> (<alternative.debc>) |
NOP5 (X86_FEATURE_HYBRID_CPU) |
jmpq 0x61a <perf_get_x86_pmu_capability+0x37> (X86_FEATURE_ALWAYS)

I think I will use this option when displaying alternative one after the other,
and your suggestion when displaying side-by-side, and add an option to select
the display.


Also, I wonder if we can make NOP5 lowercase (nop5), since it really is
just an instruction, not something special like a feature.

This indicates that this is a pseudo instruction, NOP5 is actually nopl 0x00(%eax,%eax,1).
Even NOP1 can be a simple nop but also xchg %rax,%rax.


Disassembly can show alternative jumping to the next instruction
----------------------------------------------------------------

The disassembly can show jump tables with an alternative which jumps
to the next instruction.

For example:

def9: perf_get_x86_pmu_capability+0x49 NOP2 | jmp defb <perf_get_x86_pmu_capability+0x4b> (JUMP) # <alternative.def9>

I'm also struggling to read this one.

Maybe this needs a two-line form as well:

def9: perf_get_x86_pmu_capability+0x49 <static_branch.def9> |
def9: perf_get_x86_pmu_capability+0x49 NOP2 | jmp defb <perf_get_x86_pmu_capability+0x4b>
I will do something similar as suggested above.


Example 2 (--disas option): Single Instruction Alternatives
-----------------------------------------------------------

I would like to convert this to a dedicated "disas" subcommand which can
be run like "objtool disas <func>" or so. But again that can probably
be done in a followup.

Ok, I will look at it.


Example 3 (--disas option): Alternatives with multiple instructions
-------------------------------------------------------------------
Alternatives with multiple instructions are displayed side-by-side, with
an header describing the alternative. The code in the first column is the
default code of the alternative.


$ ./tools/objtool/objtool --disas=__switch_to_asm --link vmlinux.o
__switch_to_asm:
82c0: __switch_to_asm+0x0 push %rbp
82c1: __switch_to_asm+0x1 push %rbx
82c2: __switch_to_asm+0x2 push %r12
82c4: __switch_to_asm+0x4 push %r13
82c6: __switch_to_asm+0x6 push %r14
82c8: __switch_to_asm+0x8 push %r15
82ca: __switch_to_asm+0xa mov %rsp,0x1670(%rdi)
82d1: __switch_to_asm+0x11 mov 0x1670(%rsi),%rsp
82d8: __switch_to_asm+0x18 mov 0xad8(%rsi),%rbx
82df: __switch_to_asm+0x1f mov %rbx,%gs:0x0(%rip) # 0x82e7 <__stack_chk_guard>
82e7: __switch_to_asm+0x27 | <alternative.82e7> | !X86_FEATURE_ALWAYS | X86_FEATURE_RSB_CTXSW

Are the alternatives swapped? I believe this comes from the following
code, so the !X86_FEATURE_ALWAYS column should be last?

.macro FILL_RETURN_BUFFER reg:req nr:req ftr:req ftr2=ALT_NOT(X86_FEATURE_ALWAYS)
ALTERNATIVE_2 "jmp .Lskip_rsb_\@", \
__stringify(__FILL_RETURN_BUFFER(\reg,\nr)), \ftr, \
__stringify(nop;nop;__FILL_ONE_RETURN), \ftr2

.Lskip_rsb_\@:
.endm

I will check but I process/print alternative in the order provided by
objtool (in struct alternative)


82e7: __switch_to_asm+0x27 | jmp 0x8312 <__switch_to_asm+0x52> | NOP1 | mov $0x10,%r12
82e8: __switch_to_asm+0x28 | | NOP1 |
82e9: __switch_to_asm+0x29 | NOP1 | callq 0x82ef <__switch_to_asm+0x2f> |
82ea: __switch_to_asm+0x2a | NOP1 | |
82eb: __switch_to_asm+0x2b | NOP1 | |
82ec: __switch_to_asm+0x2c | NOP1 | |
82ed: __switch_to_asm+0x2d | NOP1 | |
82ee: __switch_to_asm+0x2e | NOP1 | int3 | callq 0x82f4 <__switch_to_asm+0x34>
82ef: __switch_to_asm+0x2f | NOP1 | add $0x8,%rsp |
82f0: __switch_to_asm+0x30 | NOP1 | |
82f1: __switch_to_asm+0x31 | NOP1 | |
82f2: __switch_to_asm+0x32 | NOP1 | |
82f3: __switch_to_asm+0x33 | NOP1 | lfence | int3
82f4: __switch_to_asm+0x34 | NOP1 | | callq 0x82fa <__switch_to_asm+0x3a>
82f5: __switch_to_asm+0x35 | NOP1 | |
82f6: __switch_to_asm+0x36 | NOP1 | |
82f7: __switch_to_asm+0x37 | NOP1 | |
82f8: __switch_to_asm+0x38 | NOP1 | |
82f9: __switch_to_asm+0x39 | NOP1 | | int3
82fa: __switch_to_asm+0x3a | NOP1 | | add $0x10,%rsp
82fb: __switch_to_asm+0x3b | NOP1 | |
82fc: __switch_to_asm+0x3c | NOP1 | |
82fd: __switch_to_asm+0x3d | NOP1 | |
82fe: __switch_to_asm+0x3e | NOP1 | | dec %r12
82ff: __switch_to_asm+0x3f | NOP1 | |
8300: __switch_to_asm+0x40 | NOP1 | |
8301: __switch_to_asm+0x41 | NOP1 | | jne 0x82ee <__switch_to_asm+0x2e>
8302: __switch_to_asm+0x42 | NOP1 | |
8303: __switch_to_asm+0x43 | NOP1 | | lfence
8304: __switch_to_asm+0x44 | NOP1 | |
8305: __switch_to_asm+0x45 | NOP1 | |
8306: __switch_to_asm+0x46 | NOP1 | | movq $0xffffffffffffffff,%gs:0x0(%rip) # 0x20b <__x86_call_depth>
8307: __switch_to_asm+0x47 | NOP1 | |
8308: __switch_to_asm+0x48 | NOP1 | |
8309: __switch_to_asm+0x49 | NOP1 | |
830a: __switch_to_asm+0x4a | NOP1 | |
830b: __switch_to_asm+0x4b | NOP1 | |
830c: __switch_to_asm+0x4c | NOP1 | |
830d: __switch_to_asm+0x4d | NOP1 | |
830e: __switch_to_asm+0x4e | NOP1 | |
830f: __switch_to_asm+0x4f | NOP1 | |
8310: __switch_to_asm+0x50 | NOP1 | |
8311: __switch_to_asm+0x51 | NOP1 | |

I like this a lot, but I think it could be vertically compressed quite a
bit, and superfluous NOPs removed:

82e7: __switch_to_asm+0x27 | <alternative.82e7> | !X86_FEATURE_ALWAYS | X86_FEATURE_RSB_CTXSW
82e7: __switch_to_asm+0x27 | jmp 0x8312 <__switch_to_asm+0x52> | nop1 | mov $0x10,%r12
82e8: __switch_to_asm+0x28 | | nop1 |
82e9: __switch_to_asm+0x29 | | callq 0x82ef <__switch_to_asm+0x2f> |
82ee: __switch_to_asm+0x2e | | int3 | callq 0x82f4 <__switch_to_asm+0x34>
82ef: __switch_to_asm+0x2f | | add $0x8,%rsp |
82f3: __switch_to_asm+0x33 | | lfence | int3
82f4: __switch_to_asm+0x34 | | | callq 0x82fa <__switch_to_asm+0x3a>
82f9: __switch_to_asm+0x39 | | | int3
82fa: __switch_to_asm+0x3a | | | add $0x10,%rsp
82fe: __switch_to_asm+0x3e | | | dec %r12
8301: __switch_to_asm+0x41 | | | jne 0x82ee <__switch_to_asm+0x2e>
8303: __switch_to_asm+0x43 | | | lfence
8306: __switch_to_asm+0x46 | | | movq $0xffffffffffffffff,%gs:0x0(%rip) # 0x20b <__x86_call_depth>

That reads much nicer to me.


Yeah, better. I can easily do that by getting rid of trailing NOPs.

Thanks,

alex.