Re: [PATCH v4 01/13] objtool: Remove CFI save/restore special case

From: Peter Zijlstra
Date: Mon Mar 30 2020 - 16:03:08 EST


On Mon, Mar 30, 2020 at 02:02:05PM -0500, Josh Poimboeuf wrote:
> On Mon, Mar 30, 2020 at 07:02:00PM +0200, Peter Zijlstra wrote:
> > Subject: objtool: Implement RET_TAIL hint
> >
> > This replaces the SAVE/RESTORE hints with a RET_TAIL hint that applies to:
> >
> > - regular RETURN and sibling calls (which are also function exits);
> >   it allows the stack-frame to be off by one word, i.e. it allows a
> >   return-tail-call.
> >
> > - EXCEPTION_RETURN (a new INSN_type that splits IRET out of
> >   CONTEXT_SWITCH); here it denotes a return to self by having it
> >   consume arch_exception_frame_size bytes off the stack and continuing.
> >
> > Apply this hint to ftrace_64.S and sync_core(), the two existing users
> > of the SAVE/RESTORE hints.
> >
> > For ftrace_64.S we split the return path and make sure the
> > ftrace_epilogue call is seen as a sibling/tail-call, turning it into its
> > own function.
> >
> > By splitting the return path every instruction has a unique stack setup
> > and ORC can generate correct unwinds (XXX check if/how the ftrace
> > trampolines map into the ORC). Then employ the RET_TAIL hint to the
> > tail-call exit that has the direct-call (orig_eax) return-tail-call on.
> >
> > For sync_core() annotate the IRET with RET_TAIL to mark it as a
> > control-flow NOP that consumes the exception frame.
>
> I do like the idea to get rid of SAVE/RESTORE altogether. And it's nice
> to make that ftrace code unwinder-deterministic.
>
> However sync_core() and ftrace_regs_caller() are very different from
> each other and I find the RET_TAIL hint usage to be extremely confusing.

I was going with the pattern:

	push target
	ret

which is an indirect tail-call that doesn't need a register. We use it
in various places. We use it here precisely because it preserves all
registers, but we also use it in the function-graph tracer and retprobes
to insert the return handler, and in retpoline, because it goes through
the return stack predictor, which by happy accident isn't the indirect
branch predictor.

> For example, IRETQ isn't even a tail call.

It's the same indirect call, except with a bigger frame ;-)

	push	# ss
	push	# rsp
	push	# flags
	push	# cs
	push	# ip
	iret
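
To spell out the sizes, here is a rough sketch in C (not objtool code; the
struct and helper names are made up for illustration). It assumes 64-bit
mode, where each of the five pushed words is 8 bytes, so the frame IRETQ
consumes is 40 bytes against the single 8-byte word a plain RET consumes:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model of the frame built by the five pushes above.
 * Lowest address first, i.e. in the order IRETQ pops the words.
 */
struct iret_frame {
	uint64_t ip;
	uint64_t cs;
	uint64_t flags;
	uint64_t sp;
	uint64_t ss;
};

/* Five 8-byte words, no padding expected. */
static_assert(sizeof(struct iret_frame) == 5 * sizeof(uint64_t),
	      "iret frame is five words");

/* Bytes the stack pointer moves when IRETQ consumes the frame. */
static inline size_t iret_frame_bytes(void)
{
	return sizeof(struct iret_frame);
}

/* Bytes consumed by a plain RET: only the return address. */
static inline size_t ret_frame_bytes(void)
{
	return sizeof(uint64_t);
}
```

So the only difference between the two cases, as far as the stack goes, is
how many words the final instruction eats.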

> And the need for the hint to come *before* the insn which changes the
> state is different from the other hints.

Makes sense to me... but yeah.

> And now objtool has to know the arch exception stack size because of a
> single code site.

Agreed.

> And for a proper tail call, the stack should be empty.

All depends what you call proper :-)

> I don't
> understand the +8 thing in has_modified_stack_frame().

	push target
	ret

means we hit ret with one extra word on the stack.
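
As a toy model of that accounting (a sketch only, not what objtool actually
does; all names here are hypothetical): assume the CFA offset at function
entry is one word, for the return address, and that every push or pop moves
it by one 8-byte word. Then a lone "push target" leaves us 8 bytes over the
entry state when we reach the ret:

```c
#include <assert.h>

#define WORD 8	/* bytes per stack slot on x86-64 */

/* Hypothetical stack operations tracked by the model. */
enum op { OP_PUSH, OP_POP };

/* Assumed CFA offset at function entry: just the return address. */
static int initial_cfa_offset(void)
{
	return WORD;
}

/* Replay a sequence of pushes and pops and return the resulting offset. */
static int cfa_offset_after(const enum op *ops, int n)
{
	int off = initial_cfa_offset();

	for (int i = 0; i < n; i++)
		off += (ops[i] == OP_PUSH) ? WORD : -WORD;

	return off;
}

/* Extra bytes on the stack, relative to function entry, when RET is hit. */
static int extra_at_ret(const enum op *ops, int n)
{
	return cfa_offset_after(ops, n) - initial_cfa_offset();
}
```

A balanced push/pop pair gives 0 here, the bare "push target" gives 8, which
is where the +8 comes from.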

> It seems
> hard-coded for the weird ftrace case, rather than for tail calls in
> general (which should already work as designed).

Like I said, we have it all over the place, but I suspect they're all
mostly hidden from objtool.

> How about a more general hint like UNWIND_HINT_ADJUST?
>
> For sync_core(), after the IRETQ:
>
> UNWIND_HINT_ADJUST sp_add=40
>
> And ftrace_regs_caller_ret could have:
>
> UNWIND_HINT_ADJUST sp_add=8

I like, I'll make it happen in the morning.