Re: [RFC PATCH v3 00/22] arm64: livepatch: Use ORC for dynamic frame pointer validation
From: Jose E. Marchesi
Date: Thu Apr 13 2023 - 14:17:31 EST
> On Thu, Mar 23, 2023 at 05:17:14PM +0000, Mark Rutland wrote:
>> Hi Madhavan,
>>
>> At a high-level, I think this still falls afoul of our desire to not reverse
>> engineer control flow from the binary, and so I do not think this is the right
>> approach. I've expanded a bit on that below.
>>
>> I do think it would be nice to have *some* of the objtool changes, as I do
>> think we will want to use objtool for some things in future (e.g. some
>> build-time binary patching such as table sorting).
>>
>> > Problem
>> > =======
>> >
>> > Objtool is complex and highly architecture-dependent. There are a lot of
>> > different checks in objtool that all of the code in the kernel must pass
>> > before livepatch can be enabled. If a check fails, it must be corrected
>> > before we can proceed. Sometimes, the kernel code needs to be fixed.
>> > Sometimes, it is a compiler bug that needs to be fixed. The challenge is
>> > also to prove that all the work is complete for an architecture.
>> >
>> > As such, it presents a great challenge to enable livepatch for an
>> > architecture.
>>
>> There's a more fundamental issue here in that objtool has to reverse-engineer
>> control flow, and so even if the kernel code and compiled code generation is
>> *perfect*, it's possible that objtool won't recognise the structure of the
>> generated code, and won't be able to reverse-engineer the correct control flow.
>>
>> We've seen issues where objtool didn't understand jump tables, so support for
>> that got disabled on x86. A key objection from the arm64 side is that we don't
>> want to disable compiler code generation strategies like this. Further, as
>> compilers evolve, their code generation strategies will change, and it's likely
>> there will be other cases that crop up. This is inherently fragile.
>>
>> The key objection from the arm64 side is that we don't want to
>> reverse-engineer details from the binary, as this is complex, fragile, and
>> unstable. This is why we've previously suggested that we should work with
>> compiler folk to get what we need.
>
>> This still requires reverse-engineering the forward-edge control flow in order
>> to compute those offsets, so the same objections apply with this approach. I do
>> not think this is the right approach.
>>
>> I would *strongly* prefer that we work with compiler folk to get the
>> information that we need.
>
> IDK if it's relevant here, but I did see a commit go by to LLVM that
> seemed to include such info in a custom ELF section (for the purposes of
> improving fuzzing, IIUC). Maybe such an encoding scheme could be tested
> to see if it's reliable or usable?
> - https://github.com/llvm/llvm-project/commit/3e52c0926c22575d918e7ca8369522b986635cd3
> - https://clang.llvm.org/docs/SanitizerCoverage.html#tracing-control-flow
>
>>
>> [...]
>>
>> > FWIW, I have also compared the CFI I am generating with DWARF
>> > information that the compiler generates. The CFIs match
>> > 100% for Clang. In the case of gcc, the comparison fails
>> > in 1.7% of the cases. I have analyzed those cases and found
>> > the DWARF information generated by gcc is incorrect. The
>> > ORC generated by my Objtool is correct.
>>
>>
>> Have you reported this to the GCC folk, and can you give any examples?
>> I'm sure they would be interested in fixing this, regardless of whether we end
>> up using it.
>
> Yeah, at least a bug report is good. "See something, say something."

By all means, please. If you guys report these CFI divergences in the
GCC bugzilla, we will look into fixing them.

https://gcc.gnu.org/bugzilla