Re: [PATCH v2 1/3] x86/uprobes: Fix not using prefixes.nbytes for loop over prefixes.bytes

From: Masami Hiramatsu
Date: Thu Dec 03 2020 - 19:57:41 EST


On Thu, 3 Dec 2020 12:49:46 -0600
Tom Lendacky <thomas.lendacky@xxxxxxx> wrote:

> On 12/3/20 12:17 PM, Borislav Petkov wrote:
> > On Thu, Dec 03, 2020 at 12:10:10PM -0600, Tom Lendacky wrote:
> >> Since that struct is used in multiple places, I think basing it on the array
> >> size is the best way to go. The main point of the check is just to be sure
> >> you don't read outside of the array.
> >
> > Well, what happens if someone increases the array size of:
> >
> > struct insn_field {
> > union {
> > insn_byte_t bytes[4];
> > ^^^^
> >
> > ?
>
> I think we need to keep the parsing of the instruction separate from
> accessing the prefixes after (successfully) parsing it. This fix is merely
> making sure that we don't read outside the bounds of the array that
> currently holds the legacy prefixes.
>
> >
> > That's why a separate array only for legacy prefixes would be better
> > in the long run. The array size check is good as a short-term fix for
> > stable.
> >
> > I'd say.
>
> According to Volume 3 of the AMD APM (Figure 1-2 on page 5), there could
> be as many as 5 legacy prefixes and it says that more than one prefix from
> each group is undefined behavior. The instruction parsing code doesn't
> seem to take into account the different prefix groups. So I agree with you
> that short term the array size check works, and long term, the legacy
> prefix support probably needs a closer look.

Hmm, there is a difference between the Intel SDM and the AMD APM here.

Intel SDM vol.2

2.1.1 Instruction Prefixes
Instruction prefixes are divided into four groups, each with a set of allowable prefix codes. For each instruction, it
is only useful to include up to one prefix code from each of the four groups (Groups 1, 2, 3, 4).

AMD APM vol.3

1.2.1 Summary of Legacy Prefixes
Table 1-1 on page 7 shows the legacy prefixes. The legacy prefixes are organized into five groups, as
shown in the left-most column of Table 1-1. An instruction encoding may include a maximum of one
prefix from each of the five groups.

So an Intel CPU doesn't accept LOCK-REP because both are in the same prefix
group, but AMD says it is acceptable. Also, insn.c only accepts up to 4
distinct prefixes, so any instruction with 5 distinct prefixes will fail
to parse.

Note that the same prefix can be repeated anyway; a good example is
K8_NOP*.

/* Opteron 64bit nops
1: nop
2: osp nop
3: osp osp nop
4: osp osp osp nop
*/

In this case, insn.c stores just one osp in prefixes.bytes[] and only
increments prefixes.nbytes for the repeated prefixes.
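
To make the bytes[]/nbytes distinction concrete, here is a minimal sketch of
the behavior described above. It is not the insn.c code: struct prefix_field,
PREFIX_SLOTS, record_prefix() and count_stored_prefixes() are hypothetical
names, and the model assumes unused slots stay zero and ignores everything
else the real decoder does. It only shows why a loop over the stored prefixes
should be bounded by the array size and not by nbytes.

```c
#include <stddef.h>
#include <stdint.h>

#define PREFIX_SLOTS 4	/* assumption: mirrors bytes[4] in struct insn_field */

/* Simplified stand-in for the prefixes field; not the kernel definition. */
struct prefix_field {
	uint8_t bytes[PREFIX_SLOTS];	/* distinct prefixes, unused slots are 0 */
	uint8_t nbytes;			/* prefix bytes seen, repeats included */
};

/* Hypothetical helper: record one prefix byte as described in the mail. */
static int record_prefix(struct prefix_field *f, uint8_t b)
{
	size_t i;

	for (i = 0; i < PREFIX_SLOTS && f->bytes[i]; i++) {
		if (f->bytes[i] == b) {
			/* Repeated prefix: count it, do not store it again. */
			f->nbytes++;
			return 0;
		}
	}
	if (i == PREFIX_SLOTS)
		return -1;	/* a fifth distinct prefix does not fit */

	f->bytes[i] = b;
	f->nbytes++;
	return 0;
}

/* Walk the stored prefixes; the bound is the array size, never nbytes. */
static size_t count_stored_prefixes(const struct prefix_field *f)
{
	size_t i;

	for (i = 0; i < PREFIX_SLOTS && f->bytes[i]; i++)
		;
	return i;
}
```

In this model, feeding 0x66 (osp) three times leaves one stored byte with
nbytes == 3, so nbytes can exceed the number of valid entries in bytes[].
Feeding five distinct prefixes makes the fifth call return -1, which
corresponds to the parse failure mentioned earlier.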

Anyway, if a LOCK-REP prefix combination does exist, I will have to
introduce a new insn_field for legacy prefixes.

Thank you,

--
Masami Hiramatsu <mhiramat@xxxxxxxxxx>