On Tue, Nov 02, 2021 at 02:02:38PM -0700, Andy Lutomirski wrote:
> On Tue, Nov 2, 2021, at 11:10 AM, Kees Cook wrote:
> > On Tue, Nov 02, 2021 at 01:57:44PM +0100, Peter Zijlstra wrote:
> > > On Mon, Nov 01, 2021 at 03:14:41PM +0100, Ard Biesheuvel wrote:
> > > > On Mon, 1 Nov 2021 at 10:05, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > > > > How is that not true for the jump table approach? Like I showed earlier,
> > > > > it is *trivial* to reconstruct the actual function pointer from a
> > > > > jump-table entry pointer.
> > > >
> > > > That is not the point. The point is that Clang instruments every
> > > > indirect call that it emits, to check whether the type of the jump
> > > > table entry it is about to call matches the type of the caller. IOW,
> > > > the indirect calls can only branch into jump tables, and all jump
> > > > table entries in a table each branch to the start of some function of
> > > > the same type.
> > > >
> > > > So the only thing you could achieve by adding or subtracting a
> > > > constant value from the indirect call address is either calling
> > > > another function of the same type (if you are hitting another entry in
> > > > the same table), or failing the CFI type check.
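
A minimal sketch of the check described above, written as plain C rather than compiler-generated code. It is only a model under stated assumptions: the real jump table holds branch instructions emitted by Clang, whereas here an array of function pointers stands in for it, and the names `op_table`, `cfi_check` and `checked_call` are hypothetical. The point it illustrates is that a target address either lands exactly on an entry of the table for that function type, or the call is refused.

```c
#include <stddef.h>
#include <stdint.h>

/* One function type; every table entry must have exactly this type. */
typedef int (*op_fn)(int);

static int add_one(int x) { return x + 1; }
static int twice(int x)   { return x * 2; }

/*
 * Hypothetical stand-in for the per-type jump table: one slot per
 * function of type op_fn. A "function pointer" in this model is the
 * address of a slot, not the address of the function itself.
 */
static const op_fn op_table[] = { add_one, twice };

/*
 * Caller-side check: the target must lie inside the table and sit
 * exactly on a slot boundary. The unsigned subtraction wraps for
 * addresses below the table, so those fail the range test as well.
 */
static int cfi_check(uintptr_t target)
{
    uintptr_t off = target - (uintptr_t)op_table;

    return off < sizeof(op_table) && off % sizeof(op_table[0]) == 0;
}

/* Instrumented indirect call: returns 0 on success, -1 on a CFI failure. */
static int checked_call(uintptr_t target, int arg, int *out)
{
    if (!cfi_check(target))
        return -1;

    const op_fn *slot = (const op_fn *)target;
    *out = (*slot)(arg);
    return 0;
}
```

In this model, adding `sizeof(op_table[0])` to a valid target reaches the next function of the same type, while adding any other small constant, or stepping past the end of the table, makes `checked_call` return -1.
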
> > >
> > > Ah, I see, so the call-site needs to have a branch around the indirect
> > > call instruction.
> > >
> > > > Instrumenting the callee only needs something like BTI, and a
> > > > consistent use of the landing pads to ensure that you cannot trivially
> > > > omit the check by landing right after it.
> > >
> > > That does bring up another point tho; how are we going to do a kernel
> > > that's optimal for both software CFI and hardware aided CFI?
> > >
> > > All questions that need answering I think.
> >
> > I'm totally fine with designing a new CFI for a future option,
> > but blocking the existing (working) one does not best serve our end
> > users.
>
> I like security, but I also like building working systems, and I think
> I disagree with you. There are a whole bunch of CFI schemes out there,
> with varying hardware requirements, and they provide varying degrees
> of fine grained protection and varying degrees of protection against
> improper speculation. We do not want to merge clang CFI just because
> it’s “ready” and end up with a mess that makes it harder to support
> other schemes in the kernel.

Right, and I see the difficulties here. And speaking to Peter's
observation that CFI "accidentally" worked with static_calls, I don't
see it that way: it worked because it was designed to be as "invisible"
as possible. It's just that at a certain point of extreme binary output
control, it becomes an issue and I think that's going to be true for
*any* CFI system: they each will have different design characteristics.
One of the goals of the Clang CFI use in Linux was to make it as
minimally invasive as possible (and you can see this guiding Sami's
choices: e.g. he didn't go change all the opaque address uses to need a
"&" prefix added, etc). I think we're always going to have some
push/pull between the compiler's "general"ness and the kernel's
"specific"ness.

> So, yes, a good CFI scheme needs caller-side protection, especially if
> IBT isn’t in use. But a good CFI scheme also needs to interoperate with
> the rest of the kernel, and this whole “canonical” and symbol-based
> lookup and static_call thing is nonsense. I think we need a better
> implementation, whether it uses intrinsics or little C helpers or
> whatever.

I think we're very close already. Like I said, I think it's fine to nail
down some of these interoperability requirements; we've been doing it
all along. We got there with arm64, and it looks to me like we're almost
there on x86. There is this particular case with static_calls now, but I
don't think it's insurmountable.

Sure, and this is the kind of thing I mean: we had an awkward
implementation of a meaningful defense, and we improved on it. I think
it's important to be able to make these kinds of concessions to gain the
defensive features they provide. And yes, we can continue to improve it,
but in the meantime, we can stop entire classes of problems from
happening to our user base.