Re: [GIT PULL] gcc-plugins update for v4.8-rc1

From: Kees Cook
Date: Mon Aug 08 2016 - 19:18:53 EST


On Mon, Aug 8, 2016 at 3:38 PM, Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Tue, Aug 2, 2016 at 3:20 PM, Kees Cook <keescook@xxxxxxxxxxxx> wrote:
>>
>> Please pull this new gcc-plugin for v4.8-rc1.
>
> So I'm not pulling this for four reasons:
>
> (a) most of it is back-merges. There are 8 actual commits - but they
> are preceded by 11 pointless merges of the kbuild branch. No thank
> you.

This was an artifact of development and handling the kbuild maintainer
being on vacation during some of the time. I tried to follow sfr's
advice on best-practices for linux-next, but I guess this didn't work
well. Should I rebase on v4.8-rc1 to clean this up, or is there
something else I can do to make the merges cleaner?

> (b) of the eight actual commits, five are not about the new entropy
> plugin at all, but are generic plugin fixes. Why did those not go
> through the plugin support tree? I get the strong feeling that the
> plugin infrastructure wasn't actually ready.

In theory, this tree is the plugin support tree (now that the core
infrastructure landed by way of the kbuild tree). Perhaps I'm
misunderstanding the segregation of responsibilities on this, though.
(Or maybe I need to add scripts/Makefile.gcc-plugins to the
gcc-plugins section in the MAINTAINERS file?) FWIW, the five clean ups
were identified either during linux-next time (see vacation note
above), or were added to support new plugin features (e.g. the initify
plugin which hasn't yet entered linux-next).

> (c) Of the four remaining commits that actually add entropy, the
> "extra_latent_entropy" thing isn't about the gcc plugin at all, but
> has a pointless "klet's mix it into the plugin pool" code under an
> #ifdef. For no possible reason I can see.

Since it had some dependency on the plugin, this seemed the right
place to land it. I could certainly clean up the code to avoid an
#ifdef in the .c file, if that is of particular note. The reason for
the hash mixing there is to further perturb the latent_entropy value
based on the (possibly warm or otherwise unpredictable) physical
memory contents.

> (d) you don't actually describe what the plugin does in the pull
> request. I had to try to figure it out from reading the commit and the

Fair enough, I can improve the description quite a bit more in the
pull request (and in other places, see below).

> plugin code. And quite frankly, even then it is not actually at all
> obvious to me that this adds any real entropy at all in the situation
> that actually matters most - the embedded "everythingsis the same, and
> everybody uses the same build" situation.

It is certainly hard to measure, but given the instrumentation of
things like interrupt code, loops, and other control flow primitives,
there is unpredictability being added even on identical builds on the
same hardware.

> From what I can tell, 990% of the "entropy" this adds is about
> build-time things (random numbers generated at build-time), and the
> only real runtime entropy it adds comes from the frame pointer. And
> that may well be identical from boot to boot when there are no real
> timing differences.

The build time entropy is used to create the palette of available
random numbers that get mixed into the latent entropy hash. These are
mixed into the RNG pool based on the results of control flow, SMP
ordering, etc.

> The build-time random numbers are just pure garbage. There is *no*
> entropy in them for the situation that matters most. You might as well
> just generate one single random number at build-time without any fancy
> new compiler plugin thing.

Well, no, as mentioned above: runtime patterns do change the resulting
latent entropy. Note that the code does not credit any entropy to the
pool, it just mixes in the bytes, resulting in greater
unpredictability.

> On a real PC, this plugin doesn't seem to matter, so it really seems
> to be mostly about embedded hw without good sources of entropy. But
> I'd really want to hear why such a platform would get noticeably more
> entropy from the frame pointer games.

While the benefits here are primarily focused on early boot
randomness, as the system runs, the unpredictability continues to
slowly get fed into the RNG pool. Think of it as a low-entropy RNG
input.

> So quite honestly, a lot of this smells very much like "security
> theater" to me. Fancy code that makes things look really complex and
> fancy. But where much of it seems to be quite dubious.

It's not theater in the sense that it does measurably improve entropy
on low-entropy devices, and that it doesn't claim to be "real" entropy
(in that it credits 0 bytes per input).

> What are we going to do next? Ask people to remove their shoes while
> compiling the kernel in the name of "security"?

:P For what it's worth, even having the static pre-populated per-build
randomness added to the pools can spoil RNG prediction attempts,
especially if a kernel is being updated regularly (as done by distros,
Android, etc).

> Tell me I'm wrong. Tell me I misread the plugin, and it's stronger
> than my reading implies.

I know you're generally pretty conservative about security changes,
but I think it's stronger than you think it is. I'm not going to claim
it's a magic bullet, but then it doesn't claim that either. It's an
improvement for several use-cases, and is tidily in the corner as a
gcc plugin, so it shouldn't get in the way of a traditional build.

> But that would just be a stronger argument for actually having a
> description of what the plugin does. See my point (d) above.

I tried to help Emese with more verbose commit messages. It sounds
like the commit message in "gcc-plugins: Add latent_entropy plugin"
wasn't sufficient (and that the comments in
scripts/gcc-plugins/latent_entropy_plugin.c aren't sufficient either).

I'm happy to adjust whatever you'd like. :) Where do you think it
would be best to increase verbosity? All of the above (i.e. code
comments, commit message, and pull request message) or just focus on
one area?

As for timing, this code has seen a fair bit of testing (on multiple
architectures) while living in linux-next, so I consider it
operationally ready. If I can clean up the comments/commit logs and
you're otherwise satisfied with my descriptions, would you still
consider this for v4.8 or do you want this delayed beyond v4.8?

Thanks!

-Kees

--
Kees Cook
Nexus Security