Re: [BUG/PATCH] kernel RNG and its secrets

From: Cesar Eduardo Barros
Date: Wed Mar 18 2015 - 20:08:52 EST


On 18-03-2015 14:14, mancha wrote:
On Wed, Mar 18, 2015 at 05:02:01PM +0100, Stephan Mueller wrote:
On Wednesday, 18 March 2015, 16:09:34, Hannes Frederic Sowa wrote:
Seems like just using barrier() is the best and easiest option.

However, if the idea is to use barrier() instead of OPTIMIZER_HIDE_VAR()
in crypto_memneq() as well, then patch 0002 is the one to use. Please
review and keep in mind my analysis was limited to memzero_explicit().

Cesar, were there reasons you didn't use the gcc version of barrier()
for crypto_memneq()?

Yes. Two reasons.

Take a look at how barrier() is defined:

#define barrier() __asm__ __volatile__("": : :"memory")

It tells gcc that the dummy assembly "instruction" touches memory (so the compiler can't assume anything about the memory), and that nothing should be moved from before to after the barrier and vice versa.

It mentions nothing about registers. Therefore, as far as I know, gcc can assume that the dummy "instruction" touches no integer registers (or restores their values). I can imagine a sufficiently perverse compiler using that fact to introduce timing-dependent computations. For instance, it could load the values using more than one register and combine them at the end, after the barriers; there, it could exit early in case one of the registers is all-ones. My definition of OPTIMIZER_HIDE_VAR introduces a data dependency to prevent that:

#define OPTIMIZER_HIDE_VAR(var) __asm__ ("" : "=r" (var) : "0" (var))

The second reason is that barrier() is too strong. For crypto_memneq, only the or-chain is critical; the order or width of the loads makes no difference. The compiler could, if it wishes, do all the loads and xors first and do the or-chain at the end, or whenever it can see a pipeline bubble; it doesn't matter as long as it does *all* the "or" operations, in sequence.
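
For illustration only, here is a minimal sketch of such an or-chain, assuming a simplified byte-at-a-time loop. The name memneq_sketch is hypothetical and this is not the kernel's crypto_memneq(); it only shows where the data dependency sits relative to each "or".

```c
#include <stddef.h>

/* Same definition as quoted above: the empty asm claims to read var from a
 * register and write it back, so the compiler cannot reason about its value. */
#define OPTIMIZER_HIDE_VAR(var) __asm__ ("" : "=r" (var) : "0" (var))

/* Hypothetical helper: returns nonzero if the two buffers differ. */
static unsigned long memneq_sketch(const void *a, const void *b, size_t size)
{
	const unsigned char *pa = a;
	const unsigned char *pb = b;
	unsigned long neq = 0;

	while (size > 0) {
		/* The loads and the xor may be scheduled freely... */
		neq |= (unsigned long)(*pa ^ *pb);
		/* ...but every "or" must feed the next one through this asm. */
		OPTIMIZER_HIDE_VAR(neq);
		pa++;
		pb++;
		size--;
	}
	return neq;
}
```

Since the accumulator is opaque after every step, the compiler has no known value on which to base an early exit.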

I would be comfortable with a stronger OPTIMIZER_HIDE_VAR (adding "memory" or volatile), even though it could limit optimization opportunities, but I wouldn't be comfortable with a weaker OPTIMIZER_HIDE_VAR (removing the data dependency), unless the gcc and clang guys promise that our definition of barrier() will always prevent undesired optimization of register-only operations.
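
As a sketch of what such a stronger variant might look like (the macro name OPTIMIZER_HIDE_VAR_STRONG and the helper or_step_strong are hypothetical, not existing kernel definitions), the data dependency is kept and both volatile and a "memory" clobber are added:

```c
/* Hypothetical name, not a kernel macro. Keeps the "=r"/"0" data dependency
 * on var, and adds volatile plus a "memory" clobber on top of it. */
#define OPTIMIZER_HIDE_VAR_STRONG(var) \
	__asm__ __volatile__ ("" : "=r" (var) : "0" (var) : "memory")

/* Hypothetical helper: one step of an or-chain using the stronger macro. */
static unsigned long or_step_strong(unsigned long acc, unsigned long v)
{
	acc |= v;
	OPTIMIZER_HIDE_VAR_STRONG(acc);
	return acc;
}
```

The extra clobber also stops memory accesses from being moved across the asm, which is the lost optimization opportunity mentioned above.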

There was a third reason for the exact definition of OPTIMIZER_HIDE_VAR: it was copied from RELOC_HIDE, which is a longstanding "hide this variable from gcc" operation, and thus known to work as expected.

--
Cesar Eduardo Barros
cesarb@xxxxxxxxxxxxx