Re: [PATCH v2 10/11] x86/alternatives: Simplify ALTERNATIVE_n()

From: Peter Zijlstra
Date: Wed Sep 13 2023 - 04:47:29 EST


On Wed, Sep 13, 2023 at 06:37:38AM +0200, Borislav Petkov wrote:
> On Tue, Sep 12, 2023 at 11:44:41AM +0200, Peter Zijlstra wrote:
> > OK, I think objtool really does need the hunk you took out.
> >
> > The problem there is that we're having to create ORC data that is valid
> > for all possible alternatives -- there is only one ORC table (unless we
> > go dynamically patch the ORC table too, but so far we've managed to
> > avoid doing that).
> >
> > The constraint we have is that for every address the ORC data must match
> > between the alternatives, but because x86 is a variable length
> > instruction encoding we can (and do) play games. As long as the
> > instruction addresses do not line up, they can have different ORC data.
> >
> > One place where this matters is the tail, if we consider this a string
> > of single byte nops, that forces a bunch of ORC state to match. So what
> > we do is that we assume the tail is a single large NOP, this way we get
> > minimal overlap / ORC conflicts.
> >
> > As such, we need to know the max length when constructing the
> > alternatives, otherwise you get short alternatives jumping to somewhere
> > in the middle of the actual range and well, see above.
>
> Lemme make sure I understand this correctly. We have a three-way
> alternative in our example with the descrptors saying this:
>
> feat: 11*32+15, old: (entry_SYSCALL_64_after_hwframe+0x59/0xd8 (ffffffff81c000d1) len: 5), repl: (ffffffff833a362b, len: 5)
> feat: 3*32+21, old: (entry_SYSCALL_64_after_hwframe+0x59/0xd8 (ffffffff81c000d1) len: 5), repl: (ffffffff833a3630, len: 5)
> feat: 11*32+19, old: (entry_SYSCALL_64_after_hwframe+0x59/0xd8 (ffffffff81c000d1) len: 16), repl: (ffffffff833a3635, len: 16)
>
> i.e., the address to patch each time is ffffffff81c000d1, and the length
> is different - 5, 5 and 16.
>
> So that ORC data is tracking the starting address and the length?

No, ORC data tracks the address of every instruction that can possibly
exist in that range -- with the constraint that if two instructions have
the same address, the ORC data must match.

To reduce instruction edges in that range, we make sure the tail is a
single large instruction to the end of the alternative.

But since we now have variable length alternatives, we must search the
max length.

> I guess I don't fully understand the "middle of the actual range" thing
> because you don't really have a middle - you have the starting address
> and a length.

The alternative in the source location is of size max-length. Because
there must be room to patch in the longest alternative.

If you allow short alternatives you get:

CALL entry_untrain_ret
nop
nop
nop
nop
nop
nop
nop
nop
nop
nop
nop

Which is significantly different from:

CALL entry_untrain_ret
nop11

In that is has about 10 less ORC entries. But in order to build that
nop11 we must know the max size.

> Or are you saying that the differing length would cause ORC conflicts?

Yes, see above, the short alternative will want to continue at +5, but
we have a string of 1 byte nops there, and this will then constrain
things.

What objtool does/want is make then all of the same size so all the
tails are a single instruction to +16 so that we can disregard what is
in the actual tail.

We've gone over this multiple times already, also see commit
6c480f222128. That made sure the kernel side adhered to this scheme by
making the tail a single instruction irrespective of the length.