Re: [PATCH 13/14] x86: BHI stubs
From: Peter Zijlstra
Date: Tue Oct 01 2024 - 07:04:38 EST
On Mon, Sep 30, 2024 at 03:38:48PM -0700, Josh Poimboeuf wrote:
> On Mon, Sep 30, 2024 at 11:23:38PM +0100, Andrew Cooper wrote:
> > On 30/09/2024 10:30 pm, Josh Poimboeuf wrote:
> > > On Fri, Sep 27, 2024 at 09:49:09PM +0200, Peter Zijlstra wrote:
> > >> +SYM_INNER_LABEL(__bhi_args_0, SYM_L_LOCAL)
> > >> + UNWIND_HINT_FUNC
> > >> + cmovne %r10, %rdi
> > > IIUC, this works because if the "jz" in the CFI preamble mispredicts to
> > > the __bhi_args_* code, "cmovne" will zero out the speculative value of
> > > rdi.
> > >
> > > Why use %r10 instead of a literal $0? Also how do you know %r10 is 0?
> >
> > There's no encoding for CMOVcc which takes an $imm.
>
> Ah.
>
> > %r10 is guaranteed zero after the FineIBT prologue
>
> If the "jz" in the FineIBT prologue mispredicts, isn't %r10 non-zero by
> definition?
Since I just wrote the comment...
* FineIBT-BHI:
*
* __cfi_foo:
* endbr64
* subl 0x12345678, %r10d
* jz foo-1
* ud2
* foo-1:
* call __bhi_args_XXX
* foo+4:
* ... code here ...
* ret
*
* direct caller:
* call foo+4
*
* indirect caller:
* lea foo(%rip), %r11
* ...
* movl $0x12345678, %r10d
* subl $16, %r11
* nop4
* call *%r11
And lets take a random bhi function:
+ .align 16
+SYM_INNER_LABEL(__bhi_args_0_1, SYM_L_LOCAL)
+ UNWIND_HINT_FUNC
+ cmovne %r10, %rdi
+ cmovne %r10, %rsi
+ ANNOTATE_UNRET_SAFE
+ ret
+ int3
So the case you worry about is SUBL does *not* result in 0, but we
speculate JZ true and end up in CALL, and do CMOVne.
Since we speculated Z, we must then also not do the CMOV, so the value
of R10 is irrelevant, it will not be used. The thing however is that
CMOV will unconditionally put a store dependency on the target register
(RDI, RSI in the above sequence) and as such any further speculative
code trying to use those registers will stall.
> > , but I don't see
> > anything in patch 11 which makes this true in the !FineIBT case.
>
> I thought this code is only used by FineIBT?
Right, so I do have me a patch that adds it to regular KCFI as well, but
I dropped it for now, since I don't have a strong rationale for it and
it requires yet more compiler tinkering.