Re: [PATCH v2 1/3] x86/uprobes: Fix not using prefixes.nbytes for loop over prefixes.bytes

From: Masami Hiramatsu
Date: Fri Dec 04 2020 - 06:29:44 EST


On Fri, 4 Dec 2020 12:06:44 +0100
Borislav Petkov <bp@xxxxxxxxx> wrote:

> On Fri, Dec 04, 2020 at 09:56:53AM +0900, Masami Hiramatsu wrote:
> > Hmm, there is a difference between Intel SDM and AMD APM.
> >
> > Intel SDM vol.2
> >
> > 2.1.1 Instruction Prefixes
> > Instruction prefixes are divided into four groups, each with a set of allowable prefix codes. For each instruction, it
> > is only useful to include up to one prefix code from each of the four groups (Groups 1, 2, 3, 4).
> >
> > AMD APM vol.3
> >
> > 1.2.1 Summary of Legacy Prefixes
> > Table 1-1 on page 7 shows the legacy prefixes. The legacy prefixes are organized into five groups, as
> > shown in the left-most column of Table 1-1. An instruction encoding may include a maximum of one
> > prefix from each of the five groups.
> >
> > So, an Intel CPU doesn't accept LOCK-REP because those are in the same
> > prefix group, but AMD says it is acceptable.
>
> That would be a huge problem for code if both vendors would behave
> differently wrt prefixes.
>
> > Actually, insn.c only accepts the prefix up to 4, so if there is any
> > instruction which has 5 prefixes, it will fail to parse.
>
> Well, actually it looks more like a difference in how both vendors group
> things:
>
> AMD has 5 groups and Intel 4 by putting LOCK and REP together.
>
> The most important aspect, however, is that you can have as many
> prefixes as you want and there's no hardware limitation on the number -
> I'm being told - just that you can overflow the instruction limit of 15
> and then get a #GP for invalid insn. See here:
>
> https://sandpile.org/x86/opc_enc.htm
>
> note #1
>
> with examples how you can overflow the 15 bytes limit even with a valid
> insn.
>
> > Note that anyway the same prefix can be repeated, we can see a good example
> > in K8_NOP*.
>
> Yap.
>
> > In this case, insn.c just stores a single osp in prefixes.bytes[] and
> > only increments prefixes.nbytes for the repeated prefixes.
> >
> > Anyway, if there is LOCK-REP prefix combination, I have to introduce new
> > insn_field for legacy prefix.
>
> Well, the legacy prefixes field needs to be of 4 fields because REP and
> LOCK really are two separate but mutually exclusive groups. Why?
>
> They're used by a disjoint set of instructions, see the AMD doc for both
> REP and LOCK prefixes.
>
> Which means, you can either have a REP (exclusive or) LOCK but not both.

Yeah, I found that. So I think the "max number of legacy groups on one
instruction" is 4.
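
To illustrate why a loop over prefixes.bytes[] should be bounded by the
array size and not by prefixes.nbytes, here is a minimal userspace sketch.
It is not the insn.c code: struct prefix_field, record_prefix() and
has_prefix() are hypothetical names, and the sketch assumes only what is
described above (one slot per distinct prefix, nbytes counting repeats).
It does not model the last-prefix handling.

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical, simplified stand-in for the decoder's prefix field. */
struct prefix_field {
	unsigned char bytes[4];	/* one slot per distinct legacy prefix */
	unsigned char nbytes;	/* total prefix bytes seen, repeats included */
};

/* Record one prefix byte; a repeated prefix only bumps nbytes. */
static void record_prefix(struct prefix_field *p, unsigned char b)
{
	size_t i;

	assert(p != NULL);
	for (i = 0; i < ARRAY_SIZE(p->bytes); i++) {
		if (p->bytes[i] == b)
			break;		/* already stored */
		if (p->bytes[i] == 0) {
			p->bytes[i] = b;	/* first free slot */
			break;
		}
	}
	p->nbytes++;
}

/*
 * Return 1 if prefix b was seen. The loop is bounded by the array size,
 * not by nbytes, because nbytes may be larger than ARRAY_SIZE(bytes).
 */
static int has_prefix(const struct prefix_field *p, unsigned char b)
{
	size_t i;

	assert(p != NULL);
	for (i = 0; i < ARRAY_SIZE(p->bytes) && p->bytes[i]; i++)
		if (p->bytes[i] == b)
			return 1;
	return 0;
}
```

With six 0x66 prefixes followed by one 0xf0, nbytes ends up as 7 while only
bytes[0] and bytes[1] are filled, so "for (i = 0; i < nbytes; i++)" would
read past the end of the 4-entry array.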

> Which means, as a stable@ fix I can use Tom's ARRAY_SIZE() suggestion
> and then later on we can make the legacy prefixes a separate struct.
> Maybe even a struct with a bitfield:

Sorry, but I don't think we need such an optimization. It looks like
over-optimizing the code to me. Moreover, the last prefix is meaningful
for switching the opcode, so we need to keep it.

Thank you,


>
> struct legacy_prefixes {
> 	/* operand-size override: 0x66 */
> 	u8 os_over: 1,
> 	/* address-size override: 0x67 */
> 	   as_over: 1,
> 	/*
> 	 * segment override: 0x2e(CS), 0x3e(DS), 0x26(ES), 0x64(FS), 0x65(GS),
> 	 * 0x36(SS)
> 	 */
> 	   s_over: 1,
> 	/* lock prefix: 0xf0 */
> 	   lock: 1,
> 	/* repeat prefixes: 0xf2: REPNx, 0xf3: REPx */
> 	   rep: 1,
> 	   __resv: 3;
> };
>
> or so which you can set to denote when you've seen the respective
> prefixes.
>
> But that we can discuss later.
>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette


--
Masami Hiramatsu <mhiramat@xxxxxxxxxx>