Re: [PATCH] x86/kcfi: Optimize call sequence

From: Peter Zijlstra

Date: Wed Jun 17 2026 - 07:25:37 EST


On Wed, Jun 17, 2026 at 10:26:43AM +0100, David Laight wrote:

> > > I think it would also be better if the code doing the patching checked
> > > what it was overwriting.
> >
> > Ye of little faith :-)
>
> I wouldn't want to have to debug the consequences of getting it wrong.
> (The same goes for patching into function preamble.)

Been there, done that etc. :-) I'm the weirdo that's written all this
code.

> My 'little faith' comes from patching live kernel code with echo | dd :-)

The thing is, objtool validates that the retpolines are preceded by UD2
as a marker for kCFI and complains when this is not so (there must not
be unannotated indirect calls). And the code doing the patching already
checks that there is a mov into %r10d at the expected offset.

The update poke only happens when both of those are true (leading mov
and trailing UD2); verifying things again has very little added value.
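
To make those two checks concrete, here is a rough userspace sketch in C.
It is not the kernel's code: the function name kcfi_site_matches() and the
14-byte layout are assumptions for illustration, based on the classic kCFI
caller sequence (movl $imm32,%r10d; addl disp8(%r11),%r10d; je; ud2)
immediately in front of the indirect call. The optimized sequence this
thread is about may differ.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Assumed byte layout directly before the indirect call / thunk call:
 *   41 ba <imm32>    movl $-hash, %r10d     (6 bytes)
 *   45 03 53 <d8>    addl d8(%r11), %r10d   (4 bytes)
 *   74 02            je   1f                (2 bytes)
 *   0f 0b            ud2                    (2 bytes)
 * 1: call ...
 */
enum { KCFI_PREAMBLE_LEN = 14 };

/* Hypothetical helper: true if the bytes before call_off look like kCFI. */
static bool kcfi_site_matches(const uint8_t *buf, size_t len, size_t call_off)
{
	const uint8_t *p;

	/* The whole preamble must lie inside the buffer. */
	if (call_off < KCFI_PREAMBLE_LEN || call_off > len)
		return false;

	p = buf + call_off - KCFI_PREAMBLE_LEN;

	/* Leading mov into %r10d at the expected offset. */
	if (p[0] != 0x41 || p[1] != 0xba)
		return false;

	/* Trailing UD2 right in front of the call. */
	return p[12] == 0x0f && p[13] == 0x0b;
}
```

A patcher built along these lines would only poke a site when the helper
returns true, which covers both conditions in a single pass.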

> > > Also, what actually generates the list of cfi locations in the first place?
> > > If it is objtool, then maybe it could do the rewrite instead.
> >
> > The list with UD2 locations is compiler generated.
>
> I've never trusted compilers not to change their minds on how code
> will be compiled.

The whole kCFI sequence and placement is ABI, there is no changing that.
It is a very specific sequence that is guaranteed to be attached to
any indirect call/jmp/retpoline.

> > Also, objtool
> > typically avoids actually modifying code and generally prefers to just
> > ship additional sections such that the kernel can modify itself. There
> > is an exception to this, but there was definite grumbling about that.
>
> At least this one is an optimisation.
> The advantage of getting objtool to do the change is that objdump will
> then show the code that is being executed.

Given the amount of self-modifying code, that's a dream. Also, on
anything half recent from Intel, it'll all be rewritten to FineIBT,
which is wildly different from what objdump will be showing you.

The only way to truly see what's running is to disassemble the live
image -- either through /proc/kcore or some virtual machine gdb server.