Re: objtool: Seeking help for improving switch table processing
From: Peter Zijlstra
Date: Tue Jun 27 2023 - 04:55:50 EST
On Sat, Jun 24, 2023 at 10:06:23AM +0000, Christophe Leroy wrote:
> Hello Josh and Peter,
>
> As mentioned in the cover letter of my series "powerpc/objtool: uaccess
> validation for PPC32 (v3)" [1], a few switch table lookups fail, and it
> would help if you had ideas on how to handle them.
>
> First one is as follows. First switch is properly detected, second is not.
>
> 0000 00003818 <vsnprintf>:
> ...
> 0054 386c: 3f 40 00 00 lis r26,0 386e: R_PPC_ADDR16_HA .rodata+0x6c
> 0058 3870: 3f 20 00 00 lis r25,0 3872: R_PPC_ADDR16_HA .rodata+0x4c
> 005c 3874: 7f be eb 78 mr r30,r29
> 0060 3878: 3b 5a 00 00 addi r26,r26,0 387a: R_PPC_ADDR16_LO
> .rodata+0x6c <== First switch table address loaded in r26 register
> 0064 387c: 3b 39 00 00 addi r25,r25,0 387e: R_PPC_ADDR16_LO
> .rodata+0x4c <== Second switch table address loaded in r25 register
> ...
> 009c 38b4: 41 9d 02 64 bgt cr7,3b18 <vsnprintf+0x300> <==
> Conditional jump to where second switch is
> 00a0 38b8: 55 29 10 3a slwi r9,r9,2
> 00a4 38bc: 7d 39 48 2e lwzx r9,r25,r9
> 00a8 38c0: 7d 29 ca 14 add r9,r9,r25
> 00ac 38c4: 7d 29 03 a6 mtctr r9
> 00b0 38c8: 4e 80 04 20 bctr <== Dynamic switch branch based on r25
> register
> ...
> 0300 3b18: 39 29 ff f8 addi r9,r9,-8
> 0304 3b1c: 55 2a 06 3e clrlwi r10,r9,24
> 0308 3b20: 2b 8a 00 0a cmplwi cr7,r10,10
> 030c 3b24: 89 3f 00 00 lbz r9,0(r31)
> 0310 3b28: 41 9d 01 88 bgt cr7,3cb0 <vsnprintf+0x498>
> 0314 3b2c: 55 4a 10 3a slwi r10,r10,2
> 0318 3b30: 7d 5a 50 2e lwzx r10,r26,r10
> 031c 3b34: 7d 4a d2 14 add r10,r10,r26
> 0320 3b38: 7d 49 03 a6 mtctr r10
> 0324 3b3c: 4e 80 04 20 bctr <== Dynamic switch branch based on r26
> register
> ...
Josh is the one that knows most about the jump table stuff, but I think
he's traveling or something like that atm so he might be a little slow.
Is the problem above that both .rodata references come before the
conditional jump, such that objtool fails to correlate each indirect
jump with its .rodata table?
Looking at mark_func_jump_table(), it only seems to consider
unconditional jumps wrt jump-tables, and the above doesn't match that
pattern.
Worse, the two jump tables are interleaved, which means the only
way to untangle things is to actually track the register state :/
Specifically, if GCC wanted to, it could flip the r25 and r26 loads, and
then objtool wouldn't be able to match either of them I think, because
at that point the first jump-table would match the r26 jump-table or so
(I think, I've not fully considered the current code).
Ho-humm... what a tangle.
So for AARGH64 we also had trouble with jump-tables, but LLVM-BOLT
managed to get that working:
https://github.com/llvm/llvm-project/blob/main/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp#L458
Perhaps we can glean a clue there, but I don't immediately see the same
patterns.
I can't seem to come up with anything better than tracking the register
state, and effectively working back from 'ctr' to a .rodata reference.
That's going to be a bit of effort though...
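
To make the idea concrete, here is a rough toy sketch of walking back
from the bctr, through mtctr, add and lwzx, to the instruction that
loaded the table address. Everything in it is hypothetical: struct insn,
the op enum, find_def() and jump_table_for() are made up for
illustration and are not objtool code. It also assumes a straight-line
backward scan with no control-flow awareness and treats the lis/addi
pair as a single address load, both of which a real implementation
could not get away with.

```c
/* Hypothetical, heavily simplified instruction model (not objtool's). */
enum op { OP_OTHER, OP_LOAD_ADDR, OP_LWZX, OP_ADD, OP_MTCTR, OP_BCTR };

struct insn {
	enum op op;
	int rd;      /* register written, -1 if none */
	int ra, rb;  /* registers read, -1 if unused */
	long rodata; /* .rodata offset for OP_LOAD_ADDR, otherwise -1 */
};

/* Index of the closest instruction before 'from' that writes 'reg', or -1. */
static int find_def(const struct insn *code, int from, int reg)
{
	if (reg < 0)
		return -1;
	for (int i = from - 1; i >= 0; i--)
		if (code[i].rd == reg)
			return i;
	return -1;
}

/*
 * Walk back from a bctr: mtctr -> add -> lwzx -> address load.
 * Returns the .rodata offset of the table, or -1 if the chain breaks.
 */
static long jump_table_for(const struct insn *code, int bctr)
{
	int mtctr, add, lwzx, base, def;

	if (code[bctr].op != OP_BCTR)
		return -1;

	/* mtctr rS: the branch target comes from rS */
	for (mtctr = bctr - 1; mtctr >= 0; mtctr--)
		if (code[mtctr].op == OP_MTCTR)
			break;
	if (mtctr < 0)
		return -1;

	/* add rS,rA,rB: table entry plus table base, in either order */
	add = find_def(code, mtctr, code[mtctr].ra);
	if (add < 0 || code[add].op != OP_ADD)
		return -1;

	/* lwzx rT,rBase,rIdx: rT feeds the add, rBase is its other operand */
	lwzx = find_def(code, add, code[add].ra);
	base = code[add].rb;
	if (lwzx < 0 || code[lwzx].op != OP_LWZX || code[lwzx].ra != base) {
		lwzx = find_def(code, add, code[add].rb);
		base = code[add].ra;
	}
	if (lwzx < 0 || code[lwzx].op != OP_LWZX || code[lwzx].ra != base)
		return -1;

	/* the base register must come from a relocated .rodata address load */
	def = find_def(code, lwzx, base);
	if (def < 0 || code[def].op != OP_LOAD_ADDR)
		return -1;
	return code[def].rodata;
}
```

Fed the two sequences quoted above, this would pair the first bctr with
.rodata+0x4c (via r25) and the second with .rodata+0x6c (via r26), and
since it keys on registers rather than on instruction order, swapping
the two address loads should not change the result.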