Re: [PATCH v7 16/26] x86/insn-eval: Support both signed 32-bit and 64-bit effective addresses

From: Ricardo Neri
Date: Tue Jul 25 2017 - 19:48:21 EST


I am sorry, Boris. While working on this series I missed a few of your
feedback comments.

On Wed, 2017-06-07 at 17:48 +0200, Borislav Petkov wrote:
> On Fri, May 05, 2017 at 11:17:14AM -0700, Ricardo Neri wrote:
> > The 32-bit and 64-bit address encodings are identical. This means that we
> > can use the same function in both cases. In order to reuse the function
> > for 32-bit address encodings, we must sign-extend our 32-bit signed
> > operands to 64-bit signed variables (only for 64-bit builds). To decide on
> > whether sign extension is needed, we rely on the address size as given by
> > the instruction structure.
> >
> > Once the effective address has been computed, a special verification is
> > needed for 32-bit processes. If running on a 64-bit kernel, such processes
> > can address up to 4GB of memory. Hence, for instance, an effective
> > address of 0xffff1234 would be misinterpreted as 0xffffffffffff1234 due to
> > the sign extension mentioned above. For this reason, the 4 must be
>
> Which 4?

I meant to say the 4 most significant bytes. In this case, the
64-bit address 0xffffffffffff1234 would lie in kernel memory, while
0xffff1234 would correctly be in user space memory.
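
To make the example concrete, here is a minimal user-space sketch. It is
not the code in the patch, and the helper names are made up for
illustration. It sign-extends the 32-bit effective address, which yields
the kernel-looking value, and then drops the 4 most significant bytes to
recover the original user space address:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: sign-extend a 32-bit value to 64 bits.
 * Done with unsigned arithmetic to avoid implementation-defined casts.
 */
uint64_t sign_extend_32_to_64(uint32_t val)
{
	uint64_t wide = val;

	if (val & 0x80000000u)
		wide |= 0xffffffff00000000ULL;

	return wide;
}

/* Hypothetical helper: keep only the 4 least significant bytes. */
uint64_t truncate_to_4_bytes(uint64_t addr)
{
	return addr & 0xffffffffULL;
}

/* Checks the example address discussed above. */
void check_truncation_example(void)
{
	uint64_t extended = sign_extend_32_to_64(0xffff1234u);

	/* After sign extension the address looks like a kernel address. */
	assert(extended == 0xffffffffffff1234ULL);

	/* Dropping the upper 4 bytes gives back the user space address. */
	assert(truncate_to_4_bytes(extended) == 0xffff1234ULL);
}
```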
>
> > truncated to obtain the true effective address.
> >
> > Lastly, before computing the linear address, we verify that the effective
> > address is within the limits of the segment. The check is kept for long
> > mode because in such a case the limit is set to -1L. This is the largest
> > unsigned number possible. This is equivalent to a limit-less segment.
> >
> > Cc: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
> > Cc: Adam Buchbinder <adam.buchbinder@xxxxxxxxx>
> > Cc: Colin Ian King <colin.king@xxxxxxxxxxxxx>
> > Cc: Lorenzo Stoakes <lstoakes@xxxxxxxxx>
> > Cc: Qiaowei Ren <qiaowei.ren@xxxxxxxxx>
> > Cc: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
> > Cc: Masami Hiramatsu <mhiramat@xxxxxxxxxx>
> > Cc: Adrian Hunter <adrian.hunter@xxxxxxxxx>
> > Cc: Kees Cook <keescook@xxxxxxxxxxxx>
> > Cc: Thomas Garnier <thgarnie@xxxxxxxxxx>
> > Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> > Cc: Borislav Petkov <bp@xxxxxxx>
> > Cc: Dmitry Vyukov <dvyukov@xxxxxxxxxx>
> > Cc: Ravi V. Shankar <ravi.v.shankar@xxxxxxxxx>
> > Cc: x86@xxxxxxxxxx
> > Signed-off-by: Ricardo Neri <ricardo.neri-calderon@xxxxxxxxxxxxxxx>
> > ---
> > arch/x86/lib/insn-eval.c | 99 ++++++++++++++++++++++++++++++++++++++++++------
> > 1 file changed, 88 insertions(+), 11 deletions(-)
> >
> > diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
> > index 1a5f5a6..c7c1239 100644
> > --- a/arch/x86/lib/insn-eval.c
> > +++ b/arch/x86/lib/insn-eval.c
> > @@ -688,6 +688,62 @@ int insn_get_modrm_rm_off(struct insn *insn, struct pt_regs *regs)
> > return get_reg_offset(insn, regs, REG_TYPE_RM);
> > }
> >
> > +/**
> > + * _to_signed_long() - Cast an unsigned long into signed long
> > + * @val A 32-bit or 64-bit unsigned long
> > + * @long_bytes The number of bytes used to represent a long number
> > + * @out The casted signed long
> > + *
> > + * Return: A signed long of either 32 or 64 bits, as per the build configuration
> > + * of the kernel.
> > + */
> > +static int _to_signed_long(unsigned long val, int long_bytes, long *out)
> > +{
> > + if (!out)
> > + return -EINVAL;
> > +
> > +#ifdef CONFIG_X86_64
> > + if (long_bytes == 4) {
> > + /* higher bytes should all be zero */
> > + if (val & ~0xffffffff)
> > + return -EINVAL;
> > +
> > + /* sign-extend to a 64-bit long */
>
> So this is a 32-bit userspace on a 64-bit kernel, right?

Yes.
>
> If so, how can a memory offset be > 32-bits and we have to extend it to
> a 64-bit long?!?

Yes, perhaps the check above is not needed. I included that check as
part of my argument validation. In a 64-bit kernel, this function could
be called with val with non-zero most significant bytes.
>
> I *think* you want to say that you want to convert it to long so that
> you can do the calculation in longs.

That is exactly what I meant. More specifically, I want to convert my
32-bit variables into 64-bit signed longs; this is the reason I need the
sign extension.
>
> However!
>
> If you're a 64-bit kernel running a 32-bit userspace, you need to do
> the calculation in 32-bits only so that it overflows, as it would do
> on 32-bit hardware. IOW, the clamping to 32-bits at the end is not
> something you wanna do but actually let it wrap if it overflows.

I have looked into this closely and as far as I can see, the 4 least
significant bytes will wrap around when using 64-bit signed numbers as
they would when using 32-bit signed numbers. For instance, for two
positive numbers we have:

7fff:ffff + 7000:0000 = efff:ffff.

The addition above overflows. When sign-extended to 64-bit numbers we
would have:

0000:0000:7fff:ffff + 0000:0000:7000:0000 = 0000:0000:efff:ffff.

The addition above does not overflow. However, the 4 least significant
bytes hold the same wrapped result as the 32-bit addition, as we expect.
We can then discard the 4 most significant bytes.

For two's complement negative numbers we can have:

ffff:ffff + 8000:0000 = 7fff:ffff with a carry flag.

The addition above overflows.

When sign-extending to 64-bit numbers we would have:

ffff:ffff:ffff:ffff + ffff:ffff:8000:0000 = ffff:ffff:7fff:ffff with a
carry flag.

The addition above does not overflow. However, the 4 least significant
bytes overflowed and wrapped around as they would when using 32-bit signed
numbers.
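
The two cases above can be checked with a small user-space sketch. This
is only an illustration and not part of the patch; the function names are
hypothetical. It assumes the sums are formed with unsigned arithmetic,
because signed overflow is undefined behavior in C, while unsigned
addition wraps modulo 2^32 (or 2^64) like the hardware does:

```c
#include <assert.h>
#include <stdint.h>

/* Sum as 32-bit hardware would compute it: wraps modulo 2^32. */
uint32_t sum32_wrapping(int32_t a, int32_t b)
{
	return (uint32_t)a + (uint32_t)b;
}

/*
 * Sign-extend both operands to 64 bits, add them, and keep only the
 * 4 least significant bytes of the result.
 */
uint32_t sum64_truncated(int32_t a, int32_t b)
{
	uint64_t wide = (uint64_t)(int64_t)a + (uint64_t)(int64_t)b;

	return (uint32_t)wide;
}

/* Checks the two additions worked out above. */
void check_wraparound_examples(void)
{
	/* 7fff:ffff + 7000:0000 = efff:ffff */
	assert(sum32_wrapping(0x7fffffff, 0x70000000) == 0xefffffffu);
	assert(sum64_truncated(0x7fffffff, 0x70000000) == 0xefffffffu);

	/* ffff:ffff + 8000:0000 = 7fff:ffff, with a carry out */
	assert(sum32_wrapping(-1, INT32_MIN) == 0x7fffffffu);
	assert(sum64_truncated(-1, INT32_MIN) == 0x7fffffffu);
}
```

Both functions agree on these inputs. More generally, the low 32 bits of
a sum depend only on the low 32 bits of the operands, so the truncated
64-bit result matches the 32-bit wraparound.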

> Or am I missing something?

Now, am I missing something?

Thanks and BR,
Ricardo