Re: x86 entry perf unwinding failure (missing IRET_REGS annotation on stack switch?)

From: Peter Zijlstra
Date: Tue Apr 28 2020 - 10:35:50 EST


On Tue, Apr 28, 2020 at 09:14:57AM -0500, Josh Poimboeuf wrote:
> On Tue, Apr 28, 2020 at 02:46:27PM +0200, Peter Zijlstra wrote:
> > > I'm thinking something like this should fix it. Peter, does this look
> > > ok?
> >
> > Unfortunate. But also, I fear, insufficient. Specifically consider
> > things like:
> >
> > ALTERNATIVE "jmp 1f",
> > "alt...
> > "..."
> > "...insn", X86_FEAT_foo
> > 1:
> >
> > This results in something like:
> >
> >
> > .text .altinstr_replacement
> > e8 xx ...
> > 90
> > 90
> > ...
> > 90
> >
> > Where all our normal single byte nops (0x90) are unreachable with
> > undefined CFI, but the alternative might have CFI, which is never
> > propagated.
> >
> > We ran into this with the validate_alternative stuff from Alexandre.
>
> I don't get what you're saying. We decided not to allow CFI changes in
> alternatives. And how does this relate to my patch?

Ah, I went with a slightly looser invariant that allows CFI changes but
ensures they're the same for all alternatives. The above orig text has a
giant unreachable hole (that we don't report because it's all NOPs), and
I'm allowing the alternative's CFI changes in that hole.

Maybe that's too much leeway, but I'm thinking it ought to work.
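
A minimal sketch of that looser invariant, to make the idea concrete. This
is not objtool's code: the struct and function names below are hypothetical
and the CFI state is reduced to a CFA base register plus offset. Each byte
offset inside the alternative group gets one slot. An undefined slot (the
NOP hole) adopts whatever CFI an alternative has there, while a slot that is
already defined must match exactly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for per-instruction CFI state. */
struct cfi_sketch {
	int cfa_base;     /* register the CFA is computed from */
	int cfa_offset;   /* offset from that register */
	bool defined;     /* false: unreachable / undefined CFI */
};

/* One instruction of an alternative, placed at an offset within the group. */
struct alt_insn_sketch {
	size_t offset;
	struct cfi_sketch cfi;
};

static bool cfi_equal(const struct cfi_sketch *a, const struct cfi_sketch *b)
{
	return a->cfa_base == b->cfa_base && a->cfa_offset == b->cfa_offset;
}

/*
 * Fold one alternative's CFI into the per-offset slots of the group.
 * Returns false on the first conflict, i.e. when two alternatives
 * disagree about the CFI at the same offset.
 */
static bool merge_alt_cfi(struct cfi_sketch *slots, size_t nslots,
			  const struct alt_insn_sketch *insns, size_t n)
{
	assert(slots && insns);

	for (size_t i = 0; i < n; i++) {
		struct cfi_sketch *slot;

		if (insns[i].offset >= nslots)
			return false;   /* instruction outside the group */
		if (!insns[i].cfi.defined)
			continue;       /* nothing to propagate */

		slot = &slots[insns[i].offset];
		if (!slot->defined)
			*slot = insns[i].cfi;   /* fill the hole */
		else if (!cfi_equal(slot, &insns[i].cfi))
			return false;           /* alternatives disagree */
	}
	return true;
}
```

Calling merge_alt_cfi() once per alternative on the same slots array either
leaves every offset with a single agreed CFI state or reports the first
mismatch.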

> > > @@ -773,12 +772,26 @@ static int handle_group_alt(struct objtool_file *file,
> > > struct instruction *last_orig_insn, *last_new_insn, *insn, *fake_jump = NULL;
> > > unsigned long dest_off;
> > >
> > > + /*
> > > + * For uaccess checking, propagate the STAC/CLAC from the alternative
> > > + * to the original insn to avoid paths where we see the STAC but then
> > > + * take the NOP instead of CLAC (and vice versa).
> > > + */
> > > + if (!orig_insn->ignore_alts && orig_insn->type == INSN_NOP &&
> > > + *new_insn &&
> > > + ((*new_insn)->type == INSN_STAC ||
> > > + (*new_insn)->type == INSN_CLAC))
> > > + orig_insn->type = (*new_insn)->type;
> >
> > Also, this generates a mis-match between actual instruction text and
> > type. We now have a single byte instruction (0x90) with the type of a 3
> > > byte (STAC/CLAC). Which currently isn't a problem, but I'm looking at
> > adding infrastructure for having objtool rewrite instructions.
>
> But it doesn't actually change the original instruction bytes, it just
> changes the decoding. Is that really going to be a problem? We do that
> in other places as well, and it helps simplify code flow.

It will probably work just fine, it just feels off to me.

> Also might I ask why you're going to be rewriting instructions? That
> sounds scary.

Variable length jump label support, I can't make gnu-as (I so hate that
thing) emit the right instruction at compile-time :/

> > So rather than hacking around this issue, should we not make
> > create_orc() smarter?
>
> Maybe, though I don't see how that logic belongs in create_orc(). It
> might be tricky distinguishing between normal undefined and "undefined
> because of a skip_orig". Right now create_orc() is blissfully ignorant.

Yeah, you're right. I'll look for a better place to stick it. Perhaps I
can frob it in validate_branch() somewhere.

And you're also right on the unreachable because of skip_orig thing,
I'll think about that.