Re: [linux-next:master 14191/14955] vmlinux.o: error: objtool: amdgpu_vm_handle_fault+0x186: sibling call from callable instruction with modified stack frame
From: Mikhail Gavrilov
Date: Tue Jun 23 2026 - 16:58:02 EST
[+Josh, +Peter: objtool question below]
On Tue, Jun 23, 2026 at 8:17 PM kernel test robot <lkp@xxxxxxxxx> wrote:
> >> vmlinux.o: error: objtool: amdgpu_vm_handle_fault+0x186: sibling call from callable instruction with modified stack frame
I looked into this. It is an objtool false positive on a computed goto,
not a problem in the patch, and not specific to clang 22.1.3.
Config has CONFIG_LTO_CLANG_THIN=y, CONFIG_FRAME_POINTER=y, KASAN and
OBJTOOL_WERROR=y. objtool runs at the vmlinux.o link stage (per-TU
objects are LLVM bitcode under LTO, so the single amdgpu_vm.o never
reaches objtool). The robot hit this with clang 22.1.3; I reproduced it
on the same config with my system clang 22.1.8 (CONFIG_CLANG_VERSION=
220108), so it is not a 22.1.3-only codegen issue.
What +0x186 actually is (disasm of vmlinux.o, WERROR dropped so the
object survives):
   17f:  48 c7 c0 00 00 00 00    mov    $0x0,%rax
             R_X86_64_32S  .text.amdgpu_vm_handle_fault+0x196
   186:  ff e0                   jmp    *%rax
%rax is loaded via an R_X86_64_32S relocation with the address of label
.text.amdgpu_vm_handle_fault+0x196, and +0x196 is an unconditional jmp
back to +0x13e, the head of the second drm_exec_until_all_locked() loop.
This is the drm_exec_retry_on_contention() computed goto
(goto *__drm_exec_retry_ptr). There is a second identical pair at +0x194
-> +0x188 for the first loop. Both targets are inside the function; this
is not a tail call into another function. (svm_range_restore_pages() is
the inline stub here because CONFIG_HSA_AMD is not set, so that path is
gone.)
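For readers less familiar with labels-as-values, here is a minimal
userspace sketch of the retry pattern described above. It is not the
drm_exec.h implementation: lock_with_retry() and its contended_times
parameter are hypothetical stand-ins, and only the shape (take the
address of a label, then "goto *ptr" back to the loop head on
contention) is meant to match. It needs GCC or clang, since &&label is
a GNU C extension.

```c
#include <assert.h>

/*
 * Hypothetical model of a "retry on contention" loop built on a
 * computed goto. Returns how many times the loop body ran.
 */
static int lock_with_retry(int contended_times)
{
	int attempts = 0;
	void *retry_ptr;

	assert(contended_times >= 0);

retry:
	/* Labels-as-values: store the address of the loop head. */
	retry_ptr = &&retry;
	attempts++;

	if (contended_times > 0) {
		/* Pretend locking hit contention: restart via the pointer. */
		contended_times--;
		goto *retry_ptr;
	}

	return attempts;
}
```

Calling lock_with_retry(2) runs the body three times. The only possible
target of the indirect goto is a label in the same function, which is
the property objtool does not currently see in the vmlinux.o case.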
So clang materialized the label address as mov $imm(reloc); jmp *%rax
instead of folding it into a direct jmp to the label. For the indirect
jmp objtool looks for a jump table in .rodata, finds none (this is a
single relocated label, not an indexed table), and falls back to
treating it as an indirect sibling call. The frame is already set up
(push %rbp at +0x5, sub $0x160,%rsp at +0x16), hence "sibling call from
callable instruction with modified stack frame". Runtime behavior is
fine: the jmp lands on the intended in-function label, so this is purely
an objtool classification issue.
KASAN probably tips the balance: the function is inflated with __asan_*
checks and shadow tests, and on that body clang keeps the label in a
register rather than folding it.
The drm_exec_until_all_locked() / drm_exec_retry_on_contention() macros
are used widely (amdgpu_cs, amdgpu_gem, etc.); the patch under bisect is
just the first to put such a loop into amdgpu_vm_handle_fault().
objtool question: should objtool resolve an indirect jmp whose target
register is loaded from a relocation pointing at a .text label inside
the same function, and treat it as an intra-function jump rather than a
sibling call? That would cover the labels-as-values / computed-goto
pattern that this and any future drm_exec user under LTO+KASAN will hit.
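To make the question concrete, here is a toy model of the decision I am
asking about. It is not objtool code and uses none of objtool's real
structures: struct indirect_jmp, struct func_range, classify() and the
resolve_label_relocs switch are all hypothetical names invented for
this sketch. It only encodes the fallback described above (no jump
table means sibling call) plus the proposed extra check (a relocated
.text target inside the same function means intra-function jump).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum jump_kind { JUMP_TABLE, INTRA_FUNC_JUMP, SIBLING_CALL };

/* Hypothetical: the function's extent as section offsets. */
struct func_range {
	uint64_t start;
	uint64_t len;
};

/* Hypothetical: what is known about one indirect jmp. */
struct indirect_jmp {
	bool has_jump_table;      /* an indexed table was found in .rodata */
	bool reg_from_text_reloc; /* target register loaded via a .text reloc */
	uint64_t reloc_target;    /* section offset the reloc resolves to */
};

static enum jump_kind classify(const struct indirect_jmp *j,
			       const struct func_range *f,
			       bool resolve_label_relocs)
{
	assert(j && f);

	if (j->has_jump_table)
		return JUMP_TABLE;

	/* Proposed check: single relocated label inside this function. */
	if (resolve_label_relocs && j->reg_from_text_reloc &&
	    j->reloc_target >= f->start &&
	    j->reloc_target - f->start < f->len)
		return INTRA_FUNC_JUMP;

	/* Fallback described above: treat as an indirect sibling call. */
	return SIBLING_CALL;
}
```

With a target of 0x196 and an assumed function length of 0x200 (the
real size is not stated here), the model returns SIBLING_CALL with the
switch off and INTRA_FUNC_JUMP with it on. A target outside the range
still falls back to SIBLING_CALL.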
I am not respinning the series for this; the page-fault conversion is a
pure refactor with no manual stack work. Happy to test an objtool fix,
or to help with a drm_exec.h workaround if that is preferred over an
objtool change.
config: https://download.01.org/0day-ci/archive/20260623/202606232356.gwHMAJAW-lkp@xxxxxxxxx/config
compiler: clang 22.1.3 (robot report) and clang 22.1.8 (Fedora
22.1.8-4.fc45, my reproduction)
--
Best Regards,
Mikhail Gavrilov.