Re: [PATCH] static_call,x86: Robustify trampoline patching
From: Andy Lutomirski
Date: Tue Nov 02 2021 - 17:03:04 EST
On Tue, Nov 2, 2021, at 11:10 AM, Kees Cook wrote:
> On Tue, Nov 02, 2021 at 01:57:44PM +0100, Peter Zijlstra wrote:
>> On Mon, Nov 01, 2021 at 03:14:41PM +0100, Ard Biesheuvel wrote:
>> > On Mon, 1 Nov 2021 at 10:05, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>>
>> > > How is that not true for the jump table approach? Like I showed earlier,
>> > > it is *trivial* to reconstruct the actual function pointer from a
>> > > jump-table entry pointer.
>> > >
>> >
>> > That is not the point. The point is that Clang instruments every
>> > indirect call that it emits, to check whether the type of the jump
>> > table entry it is about to call matches the type of the caller. IOW,
>> > the indirect calls can only branch into jump tables, and all jump
>> > table entries in a table each branch to the start of some function of
>> > the same type.
>> >
>> > So the only thing you could achieve by adding or subtracting a
>> > constant value from the indirect call address is either calling
>> > another function of the same type (if you are hitting another entry in
>> > the same table), or failing the CFI type check.
>>
>> Ah, I see, so the call-site needs to have a branch around the indirect
>> call instruction.
>>
>> > Instrumenting the callee only needs something like BTI, and a
>> > consistent use of the landing pads to ensure that you cannot trivially
>> > omit the check by landing right after it.
>>
>> That does bring up another point tho; how are we going to do a kernel
>> that's optimal for both software CFI and hardware aided CFI?
>>
>> All questions that need answering I think.
>
> I'm totally fine with designing a new CFI for a future option,
> but blocking the existing (working) one does not best serve our end
> users.
I like security, but I also like building working systems, and I think I disagree with you. There are a whole bunch of CFI schemes out there, with varying hardware requirements, and they provide varying degrees of fine-grained protection and varying degrees of protection against improper speculation. We do not want to merge clang CFI just because it’s “ready” and end up with a mess that makes it harder to support other schemes in the kernel.
So, yes, a good CFI scheme needs caller-side protection, especially if IBT isn’t in use. But a good CFI scheme also needs to interoperate with the rest of the kernel, and this whole “canonical” and symbol-based lookup and static_call thing is nonsense. I think we need a better implementation, whether it uses intrinsics or little C helpers or whatever.
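For concreteness, here is a rough userspace sketch in C of the kind of caller-side check Ard describes above: an indirect call is allowed only if its target is an entry in the table for the expected function type. This is a simplified model, not what clang actually emits. The names unary_fn, unary_table, cfi_check_unary and checked_call are all made up for illustration, and the table here holds plain function pointers rather than real jump-table stubs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical function type; every table entry shares it. */
typedef int (*unary_fn)(int);

static int add_one(int x) { return x + 1; }
static int twice(int x)   { return x * 2; }

/* Hypothetical stand-in for the per-type jump table. */
static const unary_fn unary_table[] = { add_one, twice };
#define UNARY_ENTRIES (sizeof(unary_table) / sizeof(unary_table[0]))

/*
 * Caller-side check: accept the slot only if it lies inside the table
 * and sits on an entry boundary. Adding or subtracting a constant from
 * a valid slot can then only reach another entry of the same type, or
 * fail this check.
 */
static int cfi_check_unary(const unary_fn *slot)
{
    uintptr_t off = (uintptr_t)slot - (uintptr_t)unary_table;

    return off < sizeof(unary_table) && off % sizeof(unary_fn) == 0;
}

/* Checked indirect call: *ok reports whether the call was made. */
static int checked_call(const unary_fn *slot, int arg, int *ok)
{
    assert(ok != NULL);

    *ok = cfi_check_unary(slot);
    if (!*ok)
        return 0;       /* a real scheme would trap here instead */

    return (*slot)(arg);
}
```

In this model, checked_call(&unary_table[0], 41, &ok) makes the call, while a slot outside unary_table is rejected before any branch is taken. What the sketch does not cover is exactly the open question: how patching code such as static_call gets from such an entry back to the real function address.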
I’m not saying this needs to be incompatible with current clang releases, but I do think we need a clear story for how operations like static call patching are supposed to work.
FYI, Ard, many years ago we merged kernel support for the original gcc stack protector. We have since *removed* it on x86_32 in favor of a nicer implementation that requires a newer toolchain.