Re: [PATCH v8 12/28] x86/insn-eval: Add utility functions to get segment selector

From: Ricardo Neri
Date: Wed Sep 27 2017 - 18:32:46 EST


On Wed, 2017-09-27 at 13:47 +0200, Borislav Petkov wrote:
> On Tue, Sep 26, 2017 at 09:21:44PM -0700, Ricardo Neri wrote:
> >
> > This is true except when we don't have an insn at all (well, it may
> > be
> > non-NULL but it will only contain garbage). The case to which I am
> > referring is when we begin decoding our instruction. The first step
> > is
> > to copy_from_user the instruction and populate insn. For this we
> > must
> > calculate the linear address from where we copy using CS and rIP.
> Where do we do that?

UMIP emulation does it when evaluating if emulation is needed after a
#GP(0). It copy_from_user into insn the code at rIP that caused the
exception [1].
>
> >
> > Furthermore, in this only case we don't need to look at insn at all
> > as
> > the only register involved is rIP no segment override prefixes are
> > allowed.
> In any case, as it is now it sounds convoluted: you may or may not
> have an insn, and yet you call get_overridden_seg_reg() on it but you
> don't really need segment overrides because you only need CS and rIP
> initially.

The idea is thatÂget_overridden_seg_reg() would implement the logic you
just described. It would return return INAT_SEG_REG_DEFAULT/IGNORE when
segment override prefixes are not allowed (i.e., valid insn with
operand rDI and string instruction; and rIP) or needed (i.e., long
mode, except if there are override prefixes for FS or GS); or
INAT_SEG_REG_[CSDEFG]S otherwise.Â

Then resolve_seg_register() resolves the default segment if needed as
per the value returned by get_overridden_seg_reg().

Summarizing, a more accurate function name for the intended behavior is
get_overridden_seg_reg_if_any_or_needed().

> Sounds to me like this initial parsing should be done separately from
> this function...

I decided to put all the handling of segment override prefixes in a
single function.

Perhaps it could be split into two functions as follows(diff on top of
my original patches):

* Rename get_overridden_seg_reg top get_overridden_seg_reg_idx
* Remove from get_overridden_seg_reg_idx checks for rIP and rDI...
* Checks for rIP and rDI are done in a new function
* Now resolve_seg_reg calls the two functions above to determine if it
needs to resolve the default segment register index.

@@ -77,24 +77,12 @@ static bool is_string_insn(struct insn *insn)
 * INAT_SEG_REG_DEFAULT is returned if no segment override prefixes
were found
 * and the default segment register shall be used. -EINVAL in case of
error.
 */
-static int get_overridden_seg_reg(struct insn *insn, struct pt_regs
*regs,
- ÂÂint regoff)
+static int get_overridden_seg_reg_idx(struct insn *insn, struct
pt_regs *regs,
+ ÂÂÂÂÂÂint regoff)
Â{
 int idx = INAT_SEG_REG_DEFAULT;
 int sel_overrides = 0, i;
Â
- /*
- Â* Segment override prefixes should not be used for (E)IP.Â
- Â* Check this case first as we might not have (and not neededÂ
- Â* at all) a valid insn structure to evaluate segment
overrideÂ
- Â* prefixes.
- Â*/
- if (regoff == offsetof(struct pt_regs, ip)) {
- if (user_64bit_mode(regs))
- return INAT_SEG_REG_IGNORE;
- else
- return INAT_SEG_REG_DEFAULT;
- }
-
 if (!insn)
 return -EINVAL;
Â
@@ -145,18 +133,32 @@ static int get_overridden_seg_reg(struct insn
*insn, struct pt_regs *regs,
 /*
Â* More than one segment override prefix leads to undefinedÂ
Â* behavior.
Â*/
 } else if (sel_overrides > 1) {
 return -EINVAL;
- /*
- Â* Segment override prefixes are always ignored for stringÂ
- Â* instructions
- Â* that involve the use the (E)DI register.
- Â*/
- } else if ((regoff == offsetof(struct pt_regs, di)) &&
- ÂÂÂis_string_insn(insn)) {
- return INAT_SEG_REG_DEFAULT;
 }
Â
 return idx;
Â}
Â
+static int use_seg_reg_overrides(struct insn *insn, int regoff)
+{
+ /*
+ Â* Segment override prefixes should not be used for rIP.
CheckÂ
+ Â* this case first as we might not have (and not needed at
all) + Â* a valid insn structure to evaluate segment overrideÂ
+ Â* prefixes.
+ Â*/
+ if (regoff == offsetof(struct pt_regs, ip))
+ return 0;
+
+ /* Subsequent checks require a valid insn. */
+ if (!insn)
+ return -EINVAL;
+
+ if ((regoff == offsetof(struct pt_regs, di)) &&
+ ÂÂÂis_string_insn(insn))
+ return 0;
+
+ return 1;
+}
+
Â/**
 * resolve_seg_register() - obtain segment register
 * @insn: Instruction structure with segment override prefixes
@@ -179,22 +181,20 @@ static int get_overridden_seg_reg(struct insn
*insn, struct pt_regs *regs,
 */
Âstatic int resolve_seg_reg(struct insn *insn, struct pt_regs *regs,
int regoff)
Â{
- int idx;
-
- idx = get_overridden_seg_reg(insn, regs, regoff);
+ int use_pfx_overrides;
Â
- if (idx < 0)
- return idx;
-
- if (idx == INAT_SEG_REG_IGNORE)
- return idx;
+ use_pfx_overrides = use_seg_reg_overrides(insn, regoff);
+ if (use_pfx_overrides < 0)
+ return -EINVAL;
Â
- if (idx != INAT_SEG_REG_DEFAULT)
- return idx;
+ if (use_pfx_overrides == 0)
+ goto resolve_default_idx;
Â
- if (!insn)
- return -EINVAL;
+ return get_overridden_seg_reg_idx(insn, regs, regoff);
Â
+resolve_default_idx:
+ if (user_64bit_mode(regs))
+ return INAT_SEG_REG_IGNORE;
 /*
 Â* If we are here, we use the default segment register asÂ
Â* described in the Intel documentation:
@@ -209,6 +209,9 @@ static int resolve_seg_reg(struct insn *insn,
struct pt_regs *regs, int regoff)
 Â*ÂÂ+ CS for (E)IP.
 Â*/
Â
+ if (!insn)
+ return -EINVAL;
+
 switch (regoff) {
 case offsetof(struct pt_regs, ax):
 case offsetof(struct pt_regs, cx):

Does this make sense?

>
> >
> > I only used "(E)" (i.e., not the "(R|)" part) as these utility
> > functions will deal mostly with protected mode, unless FS or GS are
> > used in long mode.
> eIP or rIP is simply much easier to type and parse. Those brackets,
> not
> really.

Agreed. Then I will use rIP.
>
> >
> > I only check for a NULL insn when needed (i.e., the contents of the
> > instruction could change the used segment register).
> ... and those if (!insn) tests sprinkled around simply make the code
> unreadable and if we can get rid of them, we should.

Sure, you are correct this will make code more readable.

Thanks and BR,
Ricardo

[1].Âhttps://github.com/ricardon/tip/blob/rneri/umip_v9/arch/x86/kernel
/umip.c#L276