Re: [PATCH] random: align entropy_timer_state to cache line
From: Jason A. Donenfeld
Date: Wed Nov 30 2022 - 05:04:40 EST
Hi Eric,
On Tue, Nov 29, 2022 at 08:55:48PM -0800, Eric Biggers wrote:
> Hi Jason,
>
> On Wed, Nov 30, 2022 at 03:08:15AM +0100, Jason A. Donenfeld wrote:
> > The theory behind the jitter dance is that multiple things are poking at
> > the same cache line. This only works, however, if those things are
> > actually all in the same cache line. Ensure this is the case by aligning
> > the struct on the stack to the cache line size.
> >
> > On x86, this indeed aligns the stack struct:
> >
> > 000000000000000c <try_to_generate_entropy>:
> > {
> > c: 55 push %rbp
> > - d: 53 push %rbx
> > - e: 48 83 ec 38 sub $0x38,%rsp
> > + d: 48 89 e5 mov %rsp,%rbp
> > + 10: 41 54 push %r12
> > + 12: 53 push %rbx
> > + 13: 48 83 e4 c0 and $0xffffffffffffffc0,%rsp
> > + 17: 48 83 ec 40 sub $0x40,%rsp
> >
> > Cc: Sultan Alsawaf <sultan@xxxxxxxxxxxxxxx>
> > Fixes: 50ee7529ec45 ("random: try to actively add entropy rather than passively wait for it")
> > Signed-off-by: Jason A. Donenfeld <Jason@xxxxxxxxx>
> > ---
> > drivers/char/random.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/char/random.c b/drivers/char/random.c
> > index 67558b95d531..2494e08c76d8 100644
> > --- a/drivers/char/random.c
> > +++ b/drivers/char/random.c
> > @@ -1262,7 +1262,7 @@ static void __cold entropy_timer(struct timer_list *timer)
> > static void __cold try_to_generate_entropy(void)
> > {
> > enum { NUM_TRIAL_SAMPLES = 8192, MAX_SAMPLES_PER_BIT = HZ / 15 };
> > - struct entropy_timer_state stack;
> > + struct entropy_timer_state stack ____cacheline_aligned;
>
> Several years ago, there was a whole thing about how __attribute__((aligned)) to
> more than 8 bytes doesn't actually work on stack variables in the kernel on x86,
> because the kernel only keeps the stack 8-byte aligned but gcc assumes it is
> 16-byte aligned. See
> https://lore.kernel.org/linux-crypto/20170110143340.GA3787@xxxxxxxxxxxxxxxxxxx/T/#t
>
> IIRC, nothing was done about it at the time.
>
> Has that been resolved in the intervening years?
Maybe things are different for ____cacheline_aligned, which is 64 bytes.
Reading that thread, it looks like it was a case of trying to align the
stack to 16 bytes, but gcc assumed 16 bytes already while the kernel
only gave it 8. So gcc didn't think it needed to emit any code to align
it. Here, though, it's 64, and gcc certainly isn't assuming 64-byte
stack alignment.
Looking at the codegen, gcc appears to doing `rsp = (rsp & ~63) - 64`,
which appears correct.
Jason