Re: [PATCH 4/8] arm64: Basic Branch Target Identification support

From: Dave P Martin
Date: Tue May 28 2019 - 06:55:30 EST


On Fri, May 24, 2019 at 06:19:10PM +0100, Mark Rutland wrote:
> On Fri, May 24, 2019 at 05:12:40PM +0100, Dave Martin wrote:
> > On Fri, May 24, 2019 at 04:38:48PM +0100, Mark Rutland wrote:
> > > On Fri, May 24, 2019 at 03:53:06PM +0100, Dave Martin wrote:
> > > > On Fri, May 24, 2019 at 02:02:17PM +0100, Mark Rutland wrote:
> > > > > On Fri, May 24, 2019 at 11:25:29AM +0100, Dave Martin wrote:

[...]

> > > > > > diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
> > > > > > index 5610ac0..85b456b 100644
> > > > > > --- a/arch/arm64/kernel/syscall.c
> > > > > > +++ b/arch/arm64/kernel/syscall.c
> > > > > > @@ -66,6 +66,7 @@ static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
> > > > > > unsigned long flags = current_thread_info()->flags;
> > > > > >
> > > > > > regs->orig_x0 = regs->regs[0];
> > > > > > +regs->pstate &= ~(regs->pstate & PSR_BTYPE_MASK);
> > > > >
> > > > > Likewise:
> > > > >
> > > > > regs->pstate &= ~PSR_BTYPE_MASK;
> > > > >
> > > > > ... though I don't understand why that would matter to syscalls, nor how
> > > > > those bits could ever be set given we had to execute an SVC to get here.
> > > > >
> > > > > What am I missing?
> > > >
> > > > The behaviour is counterintuivite here. The architecture guarantees to
> > > > preserve BTYPE for traps, faults and asynchronous exceptions, but for a
> > > > synchronous execption from normal architectural execution of an
> > > > exception-generating instruction (SVC/HVC/SMC) the architecture leaves
> > > > it IMP DEF whether BTYPE is preserved or zeroed in SPSR.
> > >
> > > I'm still missing something here. IIUC were BTYPE was non-zero, we
> > > should take the BTI trap before executing the SVC/HVC/SMC, right?
> > >
> > > Otherwise, it would be possible to erroneously branch to an SVC/HVC/SMC,
> > > which would logically violate the BTI protection.
> >
> > Only if the SVC (etc.) is in a BTI guarded page. Otherwise, we could
> > have legitimately branched to the SVC insn directly and BTYPE would
> > be nonzero, but no trap would occur.
>
> I agree that would be the case immediately before we execute the SVC,
> but I think there's a subtlety here w.r.t. what exactly happens as an
> SVC is executed.
>
> My understanding was that even for unguarded pages, the execution of any
> (non branch/eret) instruction would zero PSTATE.BTYPE.
>
> For SVC it's not clear to me whether generating the SVC exception is
> considered to be an effect of completing the execution of an SVC,
> whether it's considered as preempting the execution of the SVC, or
> whether that's IMPLEMENTATION DEFINED.
>
> Consequently it's not clear to me whether or not executing an SVC clears
> PSTATE.BTYPE before the act of taking the exception samples PSTATE. I
> would hope that it does, as this would be in keeping with the way the
> ELR is updated.

OTOH, the wording calls this case out quite explicitly. It seems odd to
do that if the more general wording applies.

I'll take another look and request clarficiation.

> I think that we should try to clarify that before we commit ourselves to
> the most painful interpretation here. Especially as similar would apply
> to HVC and SMC, and I strongly suspect firmware in general is unlikely
> to fix up the PSTATE.BTYPE of a caller.
>
> > We should still logically zero BTYPE across SVC in that case, because
> > the SVC may itself branch: a signal could be delivered on return and
> > we want the prevailing BTYPE then to be 0 for capture in the signal
> > frame. Doing this zeroing in signal delivery because if the signal
> > is not delivered in SVE then a nonzero BTYPE might be live, and we
> > must then restore it properly on sigreturn.
>
> I'm not sure I follow this.
>
> If we deliver a signal, the kernel generates a pristine PSTATE for the
> signal handler, and the interrupted context doesn't matter.
>
> Saving/restoring the state of the interrupted context is identical to
> returning without delivering the signal, and we'd have a problem
> regardless.

My test looks garbled... since the point I was making was tangential, I
don't elaborate it for now.

> > As you observe, this scenario should be impossible if the SVC insn
> > is in a guarded page, unless userspace does something foolhardy like
> > catching the SIGILL and fudging BTYPE or the return address.
>
> I think userspace gets to pick up the pieces in this case. Much like
> signal delivery, it would need to generate a sensible PSTATE itself.

Agreed, there is no way to hide this kind of thing from userspace code
that messes with the signal frame -- so we shouldn't try.

> [...]
>
> > (Now I come to think of it I also need to look at rseq etc., which is
> > another magic kernel-mediated branch mechanism.)

Meh.

Cheers
---Dave
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.