Am Mittwoch, 18. März 2015, 13:19:07 schrieb Hannes Frederic Sowa:
Hi Hannes,
On Wed, Mar 18, 2015, at 13:14, Stephan Mueller wrote:
Am Mittwoch, 18. März 2015, 13:02:12 schrieb Hannes Frederic Sowa:
Hi Hannes,
On Wed, Mar 18, 2015, at 12:09, Stephan Mueller wrote:
Am Mittwoch, 18. März 2015, 11:56:43 schrieb Daniel Borkmann:
On 03/18/2015 11:50 AM, Hannes Frederic Sowa wrote:
On Wed, Mar 18, 2015, at 10:53, mancha wrote:
Hi.
The kernel RNG introduced memzero_explicit in d4c5efdb9777 to
protect
memory cleansing against things like dead store optimization:
void memzero_explicit(void *s, size_t count)
{
memset(s, 0, count);
OPTIMIZER_HIDE_VAR(s);
}
OPTIMIZER_HIDE_VAR, introduced in fe8c8a126806 to protect
crypto_memneq>>
against timing analysis, is defined when using gcc as:
#define OPTIMIZER_HIDE_VAR(var) __asm__ ("" : "=r" (var) :
"0"
(var))
My tests with gcc 4.8.2 on x86 find it insufficient to prevent
gcc
from optimizing out memset (i.e. secrets remain in memory).
Two things that do work:
__asm__ __volatile__ ("" : "=r" (var) : "0" (var))
You are correct, volatile signature should be added to
OPTIMIZER_HIDE_VAR. Because we use an output variable "=r", gcc
is
allowed to check if it is needed and may remove the asm
statement.
Another option would be to just use var as an input variable -
asm
blocks without output variables are always considered being
volatile
by gcc.
Can you send a patch?
I don't think it is security critical, as Daniel pointed out,
the
call
will happen because the function is an external call to the
crypto
functions, thus the compiler has to flush memory on return.
Just had a look.
$ gdb vmlinux
(gdb) disassemble memzero_explicit
Dump of assembler code for function memzero_explicit:
0xffffffff813a18b0 <+0>: push %rbp
0xffffffff813a18b1 <+1>: mov %rsi,%rdx
0xffffffff813a18b4 <+4>: xor %esi,%esi
0xffffffff813a18b6 <+6>: mov %rsp,%rbp
0xffffffff813a18b9 <+9>: callq 0xffffffff813a7120
<memset>
0xffffffff813a18be <+14>: pop %rbp
0xffffffff813a18bf <+15>: retq
End of assembler dump.
(gdb) disassemble extract_entropy
[...]
0xffffffff814a5000 <+304>: sub %r15,%rbx
0xffffffff814a5003 <+307>: jne 0xffffffff814a4f80
<extract_entropy+176> 0xffffffff814a5009 <+313>: mov %r12,%rdi
0xffffffff814a500c <+316>: mov $0xa,%esi
0xffffffff814a5011 <+321>: callq 0xffffffff813a18b0
<memzero_explicit> 0xffffffff814a5016 <+326>: mov
-0x48(%rbp),%rax
[...]
I would be fine with __volatile__.
Are we sure that simply adding a __volatile__ works in any case? I
just did a test with a simple user space app:
static inline void memset_secure(void *s, int c, size_t n)
{
memset(s, c, n);
//__asm__ __volatile__("": : :"memory");
__asm__ __volatile__("" : "=r" (s) : "0" (s));
}
Good point, thanks!
Of course an input or output of s does not force the memory pointed
to
by s being flushed.
My proposal would be to add a
#define OPTIMIZER_HIDE_MEM(ptr, len) __asm__ __volatile__ ("" : :
"m"(
({ struct { u8 b[len]; } *p = (void *)ptr ; *p; }) )
and use this in the code function.
This is documented in gcc manual 6.43.2.5.
That one adds the zeroization instructuctions. But now there are much
more than with the barrier.
400469: 48 c7 04 24 00 00 00 movq $0x0,(%rsp)
400470: 00
400471: 48 c7 44 24 08 00 00 movq $0x0,0x8(%rsp)
400478: 00 00
40047a: c7 44 24 10 00 00 00 movl $0x0,0x10(%rsp)
400481: 00
400482: 48 c7 44 24 20 00 00 movq $0x0,0x20(%rsp)
400489: 00 00
40048b: 48 c7 44 24 28 00 00 movq $0x0,0x28(%rsp)
400492: 00 00
400494: c7 44 24 30 00 00 00 movl $0x0,0x30(%rsp)
40049b: 00
Any ideas?
Hmm, correct definition of u8?
I use unsigned char
Which version of gcc do you use? I can't see any difference if I
compile your example at -O2.
gcc-Version 4.9.2 20150212 (Red Hat 4.9.2-6) (GCC)