On 4/1/20 6:09 PM, Peter Zijlstra wrote:
On Wed, Apr 01, 2020 at 04:43:35PM +0100, Julien Thierry wrote:
+ÂÂÂÂÂÂÂ return true;
+
+ÂÂÂ if (state->stack_size != initial_func_cfi.cfa.offset + ret_offset)
ÂÂÂÂÂÂÂÂÂÂ return true;
-ÂÂÂ for (i = 0; i < CFI_NUM_REGS; i++)
+ÂÂÂ for (i = 0; i < CFI_NUM_REGS; i++) {
ÂÂÂÂÂÂÂÂÂÂ if (state->regs[i].base != initial_func_cfi.regs[i].base ||
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂ state->regs[i].offset != initial_func_cfi.regs[i].offset)
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂ return true;
+ÂÂÂ }
ÂÂÂÂÂÂ return false;
ÂÂ }
@@ -2185,6 +2148,13 @@ static int validate_branch(struct objtoo
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂ break;
+ÂÂÂÂÂÂÂ case INSN_EXCEPTION_RETURN:
+ÂÂÂÂÂÂÂÂÂÂÂ if (func) {
+ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ state.stack_size -= arch_exception_frame_size;
+ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ break;
Why break instead of returning? Shouldn't an exception return mark the end
of a branch (whether inside or outside a function) ?
Here it seems it will continue to the next instruction which might have been
unreachable.
The code in question (x86's sync_core()), is an exception return to
self. It pushes an exception frame that points to right after the
exception return instruction.
This is the only usage of IRET in STT_FUNC symbols.
So rather than teaching objtool how to interpret the whole
push;push;push;push;push;iret sequence, teach it how big the frame is
(arch_exception_frame_size) and let it continue.
All the other (real) IRETs are in STT_NOTYPE in the entry assembly.
Right, I see.. However I'm not completely convinced by this. I must admit I haven't followed the whole conversation, but what was the issue with the HINT_IRET_SELF? It seemed more elegant, but I might be missing some context.
Otherwise, it might be worth having a comment in the code to point that this only handles the sync_core() case.
Also, instead of adding a special "arch_exception_frame_size", I could suggest:
- Picking this patch [1] from a completely arbitrary source
- Getting rid of INSN_STACK type, any instruction could then include stack ops on top of their existing semantics, they can just have an empty list if they don't touch SP/BP
- x86 decoder adds a stack_op to the iret to modify the stack pointer by the right amount