Re: [PATCH 2/2] x86/paravirt: add extra clobbers with ZERO_CALL_USED_REGS enabled

From: Bill Wendling
Date: Mon Sep 05 2022 - 02:04:12 EST


On Sat, Sep 3, 2022 at 12:18 AM Kees Cook <keescook@xxxxxxxxxxxx> wrote:
>
> On Fri, Sep 02, 2022 at 09:37:50PM +0000, Bill Wendling wrote:
> > [...]
> > callq *pv_ops+536(%rip)
>
> Do you know which pv_ops function is this? I can't figure out where
> pte_offset_kernel() gets converted into a pv_ops call....
>
This one is _paravirt_ident_64, I believe. I think that the original
issue Nathan was seeing was with another seemingly innocuous function.

> > [...]
> > --- a/arch/x86/include/asm/paravirt_types.h
> > +++ b/arch/x86/include/asm/paravirt_types.h
> > @@ -414,8 +414,17 @@ int paravirt_disable_iospace(void);
> > "=c" (__ecx)
> > #define PVOP_CALL_CLOBBERS PVOP_VCALL_CLOBBERS, "=a" (__eax)
> >
> > -/* void functions are still allowed [re]ax for scratch */
> > +/*
> > + * void functions are still allowed [re]ax for scratch.
> > + *
> > + * The ZERO_CALL_USED_REGS feature may end up zeroing out callee-saved
> > + * registers. Make sure we model this with the appropriate clobbers.
> > + */
> > +#ifdef CONFIG_ZERO_CALL_USED_REGS
> > +#define PVOP_VCALLEE_CLOBBERS "=a" (__eax), PVOP_VCALL_CLOBBERS
> > +#else
> > #define PVOP_VCALLEE_CLOBBERS "=a" (__eax)
> > +#endif
> > #define PVOP_CALLEE_CLOBBERS PVOP_VCALLEE_CLOBBERS
>
> I don't think this should depend on CONFIG_ZERO_CALL_USED_REGS; it should
> always be present.
>
> I've only been looking at this just now, so maybe I'm missing
> something. The callee clobbers are for functions with return values,
> yes?
>
Kinda. It seems that the usage here is to let the compiler know that a
register may be modified by the callee, not just that it's an "actual"
return value. So it's suitable for void functions.
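
To illustrate the idea outside the kernel, here is a minimal userspace
sketch. It is not the paravirt code: scribble_and_clear() is a made-up
helper, and the asm body is just something that happens to trash [re]ax.
The function is void, yet "=a" is still listed as a dummy output so the
compiler knows the register is modified. Non-x86-64 builds fall back to
plain C.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not kernel code: clear a 64-bit slot via inline asm
 * that scribbles on [re]ax. Nothing is "returned" in rax, but listing it
 * as a dummy output tells the compiler the register is modified, so it
 * will not keep a live value there across the statement.
 */
static inline void scribble_and_clear(uint64_t *slot)
{
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
	unsigned long scratch_rax;	/* dummy output, value is ignored */

	__asm__ __volatile__("xorl %%eax, %%eax\n\t"	/* trashes rax */
			     "movq %%rax, %1"		/* store the zero */
			     : "=&a" (scratch_rax), "=m" (*slot)
			     :
			     : "cc");
	(void)scratch_rax;
#else
	*slot = 0;	/* fallback for other targets */
#endif
}
```

Calling scribble_and_clear(&v) leaves v == 0, and any other value the
caller holds stays intact because the compiler steers it away from rax.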

> For example, 32-bit has to manually deal with doing a 64-bit value return,
> and even got it wrong originally, fixing it in commit 0eb592dbba40
> ("x86/paravirt: return full 64-bit result"), with:
>
> -#define PVOP_VCALLEE_CLOBBERS "=a" (__eax)
> +#define PVOP_VCALLEE_CLOBBERS "=a" (__eax), "=d" (__edx)
>
> But the naming is confusing, since these aren't actually clobbers,
> they're output constraints used as clobbers (the "=" modifier).
>
Right.

> Regardless, the note in the comments ...
>
> ...
> * However, x86_64 also have to clobber all caller saved registers, which
> * unfortunately, are quite a bit (r8 - r11)
> ...
>
> ... would indicate that ALL the function argument registers need to be
> marked as clobbers (i.e. the compiler can't figure this out on its own).
>
Good point. And there are some forms of these macros that specify
those as clobbers.

> I was going to say it seems like they're missing from EXTRA_CLOBBERS,
> but it's not used with any of the macros using PVOP_VCALLEE_CLOBBERS,
> and then I saw the weird alternatives patching that encodes the clobbers
> a second time (CLBR_ANY vs CLBR_RET_REG) via:
>
> #define _paravirt_alt(insn_string, type, clobber) \
> "771:\n\t" insn_string "\n" "772:\n" \
> ".pushsection .parainstructions,\"a\"\n" \
> _ASM_ALIGN "\n" \
> _ASM_PTR " 771b\n" \
> " .byte " type "\n" \
> " .byte 772b-771b\n" \
> " .short " clobber "\n" \
> ".popsection\n"
>
> And after reading the alternatives patching code which parses this via
> the following struct:
>
> /* These all sit in the .parainstructions section to tell us what to patch. */
> struct paravirt_patch_site {
> u8 *instr; /* original instructions */
> u8 type; /* type of this instruction */
> u8 len; /* length of original instruction */
> };
>
> ... I see it _doesn't use the clobbers_ at all! *head explode* I found
> that removal in commit 27876f3882fd ("x86/paravirt: Remove clobbers from
> struct paravirt_patch_site")
>
> So, I guess the CLBR_* can all be entirely removed. But back to my other
> train of thought...
>
[switches stations]

> It seems like all the input registers need to be explicitly listed in
> the PVOP_VCALLEE_CLOBBERS list (as you have), but likely should be done
> unconditionally and for 32-bit as well.
>
Possibly, though it may cause significant code degradation: when a
value is live across the asm statement and sits in a register we've
marked as clobbered, the compiler has to save it somewhere else even if
the callee never actually modifies that register. I saw that in the
example I gave in the description. In the case where a "movq" is used,
there's a useless move of "rdi" into "r11".
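
A rough way to see this kind of degradation outside the kernel is the
sketch below. Everything in it is hypothetical (call_narrow(),
call_wide() and the sum_across_*() wrappers are made-up names, and the
wide operand list only approximates the argument-register clobbers
under discussion). Both wrappers return the same value; the difference
only shows up when comparing the generated assembly, e.g. with
"gcc -O2 -S", where the wide variant has to move its arguments out of
rdi/rsi before the asm statement.

```c
#include <stdint.h>

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#define SKETCH_X86_64_ASM 1
#endif

/* Hypothetical: asm that is modelled as touching only rax. */
static inline void call_narrow(void)
{
#ifdef SKETCH_X86_64_ASM
	unsigned long rax;

	__asm__ __volatile__("xorl %%eax, %%eax" : "=a" (rax) : : "cc");
	(void)rax;
#endif
}

/* Hypothetical: same asm, but the argument registers are also listed. */
static inline void call_wide(void)
{
#ifdef SKETCH_X86_64_ASM
	unsigned long rax, rdi, rsi, rdx, rcx;

	__asm__ __volatile__("xorl %%eax, %%eax"
			     : "=a" (rax), "=D" (rdi), "=S" (rsi),
			       "=d" (rdx), "=c" (rcx)
			     :
			     : "cc", "r8", "r9", "r10", "r11");
	(void)rax; (void)rdi; (void)rsi; (void)rdx; (void)rcx;
#endif
}

/* a and b arrive in rdi/rsi and stay live across the asm statement. */
uint64_t sum_across_narrow(uint64_t a, uint64_t b)
{
	call_narrow();
	return a + b;
}

uint64_t sum_across_wide(uint64_t a, uint64_t b)
{
	call_wide();
	return a + b;
}
```

The extra moves in sum_across_wide() are wasted work whenever the real
callee leaves those registers alone, which is the cost of listing them
unconditionally.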

> (Also, please CC linux-hardening@xxxxxxxxxxxxxxx.)
>
Doh! Someday I'll learn email.

-bw