Re: [PATCH v2 10/11] x86/alternatives: Simplify ALTERNATIVE_n()

From: Peter Zijlstra
Date: Tue Sep 12 2023 - 05:45:04 EST


On Tue, Sep 12, 2023 at 11:27:09AM +0200, Peter Zijlstra wrote:
> On Sun, Sep 10, 2023 at 04:42:27PM +0200, Borislav Petkov wrote:
> > On Sat, Sep 09, 2023 at 11:25:54AM +0200, Peter Zijlstra wrote:
> > > So what you end up with is:
> > >
> > > 661:
> > > "one byte orig insn"
> > > "one nop because alt1 is 2 bytes"
> > > "one nop because alt2 is 3 bytes"
> > >
> > > right?
> >
> > Right.
> >
> > > This becomes more of a problem with your example above where the
> > > respective lengths are 0, 5, 16. In that case, when you patch 5, you'll
> > > leave 11 single nops in there.
> >
> > Well, I know what you mean but the code handles that gracefully and it
> > works. Watch this:
>
> Aah, because we run optimize_nops() for all alternatives, irrespective
> of it being selected. And thus also for the longest and then that'll fix
> things up.
>
> OK, let me check on objtool.

OK, I think objtool really does need the hunk you took out.

The problem there is that we have to create ORC data that is valid
for all possible alternatives -- there is only one ORC table (unless we
also start dynamically patching the ORC table, but so far we've managed
to avoid doing that).

The constraint we have is that for every address the ORC data must match
between the alternatives, but because x86 has a variable-length
instruction encoding we can (and do) play games. As long as the
instruction addresses do not line up, they can have different ORC data.
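
To make that rule concrete, here is a minimal sketch in C. It is not
objtool's actual code: the struct, the sp_offset field standing in for
real ORC state, the function names and the example byte layouts are all
made up for illustration. It only counts a conflict where two
alternatives both have an instruction starting at the same offset.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: one entry per instruction start in an alternative. */
struct insn_state {
	unsigned int offset;	/* byte offset from the start of the region */
	int sp_offset;		/* stand-in for the ORC state at that address */
};

/*
 * Count the offsets where both alternatives start an instruction but
 * disagree on the unwind state. Both arrays must be sorted by offset.
 */
static size_t orc_conflicts(const struct insn_state *a, size_t na,
			    const struct insn_state *b, size_t nb)
{
	size_t i = 0, j = 0, conflicts = 0;

	assert(a != NULL && b != NULL);

	while (i < na && j < nb) {
		if (a[i].offset < b[j].offset) {
			i++;		/* only 'a' has an instruction here */
		} else if (a[i].offset > b[j].offset) {
			j++;		/* only 'b' has an instruction here */
		} else {
			if (a[i].sp_offset != b[j].sp_offset)
				conflicts++;
			i++;
			j++;
		}
	}
	return conflicts;
}

/* Made-up layout: 1-byte insn that changes the state, then a 2-byte insn. */
static const struct insn_state alt_a[] = { { 0, 8 }, { 1, 16 } };
/* Made-up layout: a single 3-byte insn, so nothing starts at offset 1. */
static const struct insn_state alt_b[] = { { 0, 8 } };
/* Made-up layout: an insn starts at offset 1 with the old state. */
static const struct insn_state alt_c[] = { { 0, 8 }, { 1, 8 } };

static size_t example_misaligned(void)
{
	return orc_conflicts(alt_a, 2, alt_b, 1);
}

static size_t example_aligned(void)
{
	return orc_conflicts(alt_a, 2, alt_c, 2);
}
```

In this model alt_a and alt_b disagree about what happens at offset 1,
but only alt_a has an instruction starting there, so no conflict is
counted. alt_a and alt_c both start an instruction at offset 1 with
different state, and that is the case that has to be avoided.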

One place where this matters is the tail: if we consider it a string
of single-byte nops, that forces a bunch of ORC state to match. So what
we do is assume the tail is a single large NOP; this way we get
minimal overlap / ORC conflicts.

As such, we need to know the max length when constructing the
alternatives, otherwise you get short alternatives jumping to somewhere
in the middle of the actual range and, well, see above.
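
As a rough illustration of why the max length is needed, here is a
small sketch in C. Again this is not objtool's code; the struct and
function names are hypothetical, and the only data taken from the
thread is the 0, 5, 16 length example. It models a short alternative's
tail as covering [alt_len, max_len) and counts how many instruction
starts the tail contributes under each view.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the padding that follows a short alternative. */
struct tail {
	unsigned int start;	/* first byte after the alternative's insns */
	unsigned int len;	/* bytes up to the end of the longest one */
};

/* Lengths from the example earlier in the thread. */
static const unsigned int example_lens[] = { 0, 5, 16 };

/* The region has to be sized for the longest alternative. */
static unsigned int max_alt_len(const unsigned int *lens, size_t n)
{
	unsigned int max = 0;

	for (size_t i = 0; i < n; i++)
		if (lens[i] > max)
			max = lens[i];
	return max;
}

/* The tail runs from the end of this alternative to the end of the region. */
static struct tail alt_tail(unsigned int alt_len, unsigned int max_len)
{
	assert(alt_len <= max_len);

	struct tail t = { .start = alt_len, .len = max_len - alt_len };
	return t;
}

/*
 * Number of instruction starts inside the tail. Each one is an address
 * whose ORC state may have to agree with the other alternatives.
 */
static unsigned int tail_insn_starts(struct tail t, int single_nop)
{
	if (t.len == 0)
		return 0;
	return single_nop ? 1 : t.len;
}
```

For the 5-byte alternative in a 16-byte region the tail is 11 bytes.
Seen as single-byte nops that is 11 instruction starts that could
collide with the 16-byte alternative; seen as one large NOP it is a
single start at offset 5. Without max_len the tail cannot be computed
at all.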