Re: [PATCH] x86/kcfi: Optimize call sequence
From: David Laight
Date: Wed Jun 17 2026 - 05:32:03 EST
On Wed, 17 Jun 2026 09:08:13 +0200
Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> On Tue, Jun 16, 2026 at 09:47:22PM +0100, David Laight wrote:
>
> > > --- a/arch/x86/kernel/alternative.c
> > > +++ b/arch/x86/kernel/alternative.c
> > > @@ -1356,6 +1356,10 @@ early_param("cfi", cfi_parse_cmdline);
> > > * "Make conditional jumps most often not taken: The efficiency and throughput
> > > * for not-taken branches is better than for taken branches on most
> > > * processors. Therefore, it is good to place the most frequent branch first"
> > > + *
> > > + * NOTE: Update the kCFI caller sequence to make use of this observation.
> > > + * Replace the "je 1f; ud2" sequence with "jne +1; test $0xd6, %al". This
> > > + * clobbers flags, but those are clobbered by the hash test anyway.
> >
> > I think it would be better to give the byte sequences for both pairs of
> > instructions - it takes a bit of sleuthing to check they are the same size.
>
> You mean, expand the comment like a few lines above, where we have the
> kCFI/FineIBT contrast? Sure, I suppose I can make this comment longer
> still.
More detail and less waffle :-)
I had to read the earlier comment several times because it mentions using
udb and then gives a code snippet that contains ud2.
I then had to check the instruction encodings for both (and neither is in
the 286 or 386 books on my desk).
Just adding (0f,0b) after one of the ud2s and (d6) after a udb would help.
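For reference, the size check can be written down as a small userspace
sketch. It assumes the short (rel8) forms of je/jne, the two-byte
"test $imm8, %al" encoding (a8 ib), and that the sequence starts at the
conditional jump. The array names and the helper are made up for
illustration and are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Current caller tail: "je 1f; ud2" (assumed short-form encoding). */
static const uint8_t kcfi_old[] = {
	0x74, 0x02,	/* je   +2, skips over the ud2 */
	0x0f, 0x0b,	/* ud2 */
};

/* Proposed caller tail: "jne +1; test $0xd6, %al". */
static const uint8_t kcfi_new[] = {
	0x75, 0x01,	/* jne  +1, lands on the 0xd6 immediate (udb) */
	0xa8, 0xd6,	/* test $0xd6, %al */
};

/* Both variants must occupy the same number of bytes to be patchable. */
static_assert(sizeof(kcfi_old) == sizeof(kcfi_new),
	      "old and new sequences differ in size");

/*
 * Hypothetical helper: offset from the start of the sequence that a
 * taken rel8 branch at offset 0 lands on (2-byte jcc plus displacement).
 */
static int jcc_rel8_target(const uint8_t *seq)
{
	return 2 + (int8_t)seq[1];
}
```

With these encodings the taken "je" falls off the end of the old sequence
(offset 4), while the taken "jne" lands on offset 3 of the new one, which
is the d6 byte.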
> > I think it would also be better if the code doing the patching checked
> > what it was overwriting.
>
> Ye of little faith :-)
I wouldn't want to have to debug the consequences of getting it wrong.
(The same goes for patching into function preamble.)
My 'little faith' comes from patching live kernel code with echo | dd :-)
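What I have in mind is roughly the following. This is a plain userspace
sketch, not kernel code: patch_if_expected() is a made-up name, and a
real version would have to go through the kernel's own text-patching
path rather than memcpy().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not a kernel API: replace len bytes at addr with
 * repl, but only if they currently match expect.
 */
static bool patch_if_expected(uint8_t *addr, const uint8_t *expect,
			      const uint8_t *repl, size_t len)
{
	if (memcmp(addr, expect, len) != 0)
		return false;	/* unexpected bytes: leave the site alone */

	memcpy(addr, repl, len);
	return true;
}
```

A caller that gets false back can warn and skip the site instead of
scribbling over whatever the compiler actually emitted there.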
>
> > Also, what actually generates the list of cfi locations in the first place?
> > If it is objtool, then maybe it could do the rewrite instead.
>
> The list with UD2 locations is compiler generated.
I've never trusted compilers not to change their minds on how code
will be compiled.
> Also, objtool
> typically avoids actually modifying code and generally prefers to just
> ship additional sections such that the kernel can modify itself. There
> is an exception to this, but there was definite grumbling about that.
At least this one is an optimisation.
The advantage of getting objtool to do the change is that objdump will
then show the code that is being executed.
David