Re: [BUG/PATCH] kernel RNG and its secrets
From: Hannes Frederic Sowa
Date: Wed Mar 18 2015 - 08:19:20 EST
On Wed, Mar 18, 2015, at 13:14, Stephan Mueller wrote:
> Am Mittwoch, 18. MÃrz 2015, 13:02:12 schrieb Hannes Frederic Sowa:
>
> Hi Hannes,
>
> >On Wed, Mar 18, 2015, at 12:09, Stephan Mueller wrote:
> >> Am Mittwoch, 18. MÃrz 2015, 11:56:43 schrieb Daniel Borkmann:
> >> >On 03/18/2015 11:50 AM, Hannes Frederic Sowa wrote:
> >> >> On Wed, Mar 18, 2015, at 10:53, mancha wrote:
> >> >>> Hi.
> >> >>>
> >> >>> The kernel RNG introduced memzero_explicit in d4c5efdb9777 to
> >> >>> protect
> >> >>>
> >> >>> memory cleansing against things like dead store optimization:
> >> >>> void memzero_explicit(void *s, size_t count)
> >> >>> {
> >> >>>
> >> >>> memset(s, 0, count);
> >> >>> OPTIMIZER_HIDE_VAR(s);
> >> >>>
> >> >>> }
> >> >>>
> >> >>> OPTIMIZER_HIDE_VAR, introduced in fe8c8a126806 to protect
> >> >>> crypto_memneq>>
> >> >>>
> >> >>> against timing analysis, is defined when using gcc as:
> >> >>> #define OPTIMIZER_HIDE_VAR(var) __asm__ ("" : "=r" (var) :
> >> >>> "0"
> >> >>> (var))
> >> >>>
> >> >>> My tests with gcc 4.8.2 on x86 find it insufficient to prevent
> >> >>> gcc
> >> >>> from optimizing out memset (i.e. secrets remain in memory).
> >> >>>
> >> >>> Two things that do work:
> >> >>> __asm__ __volatile__ ("" : "=r" (var) : "0" (var))
> >> >>
> >> >> You are correct, volatile signature should be added to
> >> >> OPTIMIZER_HIDE_VAR. Because we use an output variable "=r", gcc is
> >> >> allowed to check if it is needed and may remove the asm statement.
> >> >> Another option would be to just use var as an input variable - asm
> >> >> blocks without output variables are always considered being
> >> >> volatile
> >> >> by gcc.
> >> >>
> >> >> Can you send a patch?
> >> >>
> >> >> I don't think it is security critical, as Daniel pointed out, the
> >> >> call
> >> >> will happen because the function is an external call to the crypto
> >> >> functions, thus the compiler has to flush memory on return.
> >> >
> >> >Just had a look.
> >> >
> >> >$ gdb vmlinux
> >> >(gdb) disassemble memzero_explicit
> >> >
> >> >Dump of assembler code for function memzero_explicit:
> >> > 0xffffffff813a18b0 <+0>: push %rbp
> >> > 0xffffffff813a18b1 <+1>: mov %rsi,%rdx
> >> > 0xffffffff813a18b4 <+4>: xor %esi,%esi
> >> > 0xffffffff813a18b6 <+6>: mov %rsp,%rbp
> >> > 0xffffffff813a18b9 <+9>: callq 0xffffffff813a7120
> <memset>
> >> > 0xffffffff813a18be <+14>: pop %rbp
> >> > 0xffffffff813a18bf <+15>: retq
> >> >
> >> >End of assembler dump.
> >> >
> >> >(gdb) disassemble extract_entropy
> >> >[...]
> >> >
> >> > 0xffffffff814a5000 <+304>: sub %r15,%rbx
> >> > 0xffffffff814a5003 <+307>: jne 0xffffffff814a4f80
> >> >
> >> ><extract_entropy+176> 0xffffffff814a5009 <+313>: mov %r12,%rdi
> >> >
> >> > 0xffffffff814a500c <+316>: mov $0xa,%esi
> >> > 0xffffffff814a5011 <+321>: callq 0xffffffff813a18b0
> >> >
> >> ><memzero_explicit> 0xffffffff814a5016 <+326>: mov
> >> >-0x48(%rbp),%rax
> >> >[...]
> >> >
> >> >I would be fine with __volatile__.
> >>
> >> Are we sure that simply adding a __volatile__ works in any case? I
> >> just did a test with a simple user space app:
> >>
> >> static inline void memset_secure(void *s, int c, size_t n)
> >> {
> >>
> >> memset(s, c, n);
> >> //__asm__ __volatile__("": : :"memory");
> >> __asm__ __volatile__("" : "=r" (s) : "0" (s));
> >>
> >> }
> >
> >Good point, thanks!
> >
> >Of course an input or output of s does not force the memory pointed to
> >by s being flushed.
> >
> >
> >My proposal would be to add a
> >
> >#define OPTIMIZER_HIDE_MEM(ptr, len) __asm__ __volatile__ ("" : : "m"(
> >({ struct { u8 b[len]; } *p = (void *)ptr ; *p; }) )
> >
> >and use this in the code function.
> >
> >This is documented in gcc manual 6.43.2.5.
>
> That one adds the zeroization instructuctions. But now there are much
> more than with the barrier.
>
> 400469: 48 c7 04 24 00 00 00 movq $0x0,(%rsp)
> 400470: 00
> 400471: 48 c7 44 24 08 00 00 movq $0x0,0x8(%rsp)
> 400478: 00 00
> 40047a: c7 44 24 10 00 00 00 movl $0x0,0x10(%rsp)
> 400481: 00
> 400482: 48 c7 44 24 20 00 00 movq $0x0,0x20(%rsp)
> 400489: 00 00
> 40048b: 48 c7 44 24 28 00 00 movq $0x0,0x28(%rsp)
> 400492: 00 00
> 400494: c7 44 24 30 00 00 00 movl $0x0,0x30(%rsp)
> 40049b: 00
>
> Any ideas?
Hmm, correct definition of u8?
Which version of gcc do you use? I can't see any difference if I compile
your example at -O2.
Bye,
Hannes
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/