Re: [PATCH v7 3/3] x86: vdso: Wire up getrandom() vDSO implementation
From: Jason A. Donenfeld
Date: Sun Nov 27 2022 - 17:07:29 EST

Hi Thomas,

On Sat, Nov 26, 2022 at 12:08:41AM +0100, Thomas Gleixner wrote:
> Jason!
>
> On Thu, Nov 24 2022 at 17:55, Jason A. Donenfeld wrote:
> > +++ b/arch/x86/entry/vdso/vgetrandom-chacha.S
> > +/*
> > + * Very basic SSE2 implementation of ChaCha20. Produces a given positive number
> > + * of blocks of output with a nonce of 0, taking an input key and 8-byte
> > + * counter. Importantly does not spill to the stack. Its arguments are:
>
> Basic or not.

Heh, FYI I didn't mean "basic" here as in "doesn't need a review", but
just that it's a straightforward technique and doesn't do any
complicated multiblock pyrotechnics (which frankly aren't really
needed).

> This needs a Reviewed-by from someone who understands SSE2
> and ChaCha20 before this can go anywhere near the x86 tree.

No problem. I'll see to it that somebody qualified gives this a review.

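For anyone following along who wants a concrete picture of the core
operation, here is a minimal scalar sketch of the ChaCha quarter round
as described in RFC 7539, section 2.1. It is not the SSE2 code under
review and makes no attempt to avoid the stack. The names rotl32(),
chacha_quarter_round() and chacha_quarter_round_selftest() are
hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit word left by n bits (0 < n < 32). */
static inline uint32_t rotl32(uint32_t v, unsigned int n)
{
	return (v << n) | (v >> (32 - n));
}

/* One ChaCha quarter round over four state words (illustrative scalar version). */
static inline void chacha_quarter_round(uint32_t *a, uint32_t *b, uint32_t *c, uint32_t *d)
{
	*a += *b; *d ^= *a; *d = rotl32(*d, 16);
	*c += *d; *b ^= *c; *b = rotl32(*b, 12);
	*a += *b; *d ^= *a; *d = rotl32(*d, 8);
	*c += *d; *b ^= *c; *b = rotl32(*b, 7);
}

/* Checks the quarter round against the test vector in RFC 7539, section 2.1.1. */
void chacha_quarter_round_selftest(void)
{
	uint32_t a = 0x11111111, b = 0x01020304, c = 0x9b8d6f43, d = 0x01234567;

	chacha_quarter_round(&a, &b, &c, &d);
	assert(a == 0xea2a92f4 && b == 0xcb1cf8ce);
	assert(c == 0x4581472e && d == 0x5881c4bb);
}
```

A full block applies this to the columns and diagonals of the 4x4 state
for 20 rounds and then adds the original state back in.
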
> > +#include <linux/kernel.h>
>
> Why do you need kernel.h here?

Turns out I don't, thanks.

> > +static __always_inline ssize_t
> > +getrandom_syscall(void *buffer, size_t len, unsigned int flags)
>
> static __always_inline ssize_t getrandom_syscall(void *buffer, size_t len, unsigned int flags)
>
> please. We expanded to 100 quite some time ago.
>
> Some kernel-doc compliant comment for this would be appreciated as well.

Will do.

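As a rough idea of the shape being asked for, here is a userspace
sketch with the signature on one line and a kernel-doc style comment.
It assumes x86-64 Linux and a GCC-compatible compiler; the kernel's
__always_inline is spelled out as an attribute, asm is written as
__asm__ so it also builds in strict -std modes, and the comment wording
is my own guess rather than what v8 will carry. The demo function
getrandom_syscall_demo() is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>		/* ssize_t */
#include <sys/syscall.h>	/* __NR_getrandom */

/**
 * getrandom_syscall - Invoke the getrandom() syscall directly.
 * @buffer:	Destination buffer for the random bytes.
 * @len:	Size of @buffer in bytes.
 * @flags:	Zero or more GRND_* flags.
 *
 * Return: The number of bytes written to @buffer on success, or a
 * negative errno value on failure.
 */
static inline __attribute__((always_inline)) ssize_t getrandom_syscall(void *buffer, size_t len, unsigned int flags)
{
	long ret;

	/* x86-64 syscall ABI: number in rax, args in rdi/rsi/rdx; rcx and r11 are clobbered. */
	__asm__ ("syscall" : "=a" (ret) :
		 "0" (__NR_getrandom), "D" (buffer), "S" (len), "d" (flags) :
		 "rcx", "r11", "memory");

	return ret;
}

/* Usage sketch: request 16 bytes and check that all of them arrived. */
void getrandom_syscall_demo(void)
{
	unsigned char buf[16];

	assert(getrandom_syscall(buf, sizeof(buf), 0) == (ssize_t)sizeof(buf));
}
```
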
>
> > +{
> > + long ret;
> > +
> > + asm ("syscall" : "=a" (ret) :
> > + "0" (__NR_getrandom), "D" (buffer), "S" (len), "d" (flags) :
> > + "rcx", "r11", "memory");
> > +
> > + return ret;
> > +}
> > +
> > +#define __vdso_rng_data (VVAR(_vdso_rng_data))
> > +
> > +static __always_inline const struct vdso_rng_data *__arch_get_vdso_rng_data(void)
> > +{
> > + if (__vdso_data->clock_mode == VDSO_CLOCKMODE_TIMENS)
> > + return (void *)&__vdso_rng_data +
> > + ((void *)&__timens_vdso_data - (void *)&__vdso_data);
> > + return &__vdso_rng_data;
>
> So either bite the bullet and write it:
>
> if (__vdso_data->clock_mode == VDSO_CLOCKMODE_TIMENS)
> return (void *)&__vdso_rng_data + ((void *)&__timens_vdso_data - (void *)&__vdso_data);

Seems fine to me. I'll write it like that.

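To make the arithmetic in that return statement concrete, here is a
small userspace sketch. It mirrors only the pointer offsetting, not the
real vvar layout: every type and name in it (fake_vdso_data,
fake_rng_data, fake_vvar_region, regions, fake_get_rng_data) is a
made-up stand-in. It also uses char pointers, since arithmetic on
void * is a GNU C extension that the kernel relies on but portable C
does not have.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's vvar data; fields are illustrative only. */
struct fake_vdso_data { int clock_mode; };
struct fake_rng_data { unsigned long generation; };

/* One "region": clock data followed by RNG data at a fixed offset. */
struct fake_vvar_region {
	struct fake_vdso_data vdso_data;
	struct fake_rng_data rng_data;
};

enum { FAKE_CLOCKMODE_NONE, FAKE_CLOCKMODE_TIMENS };

/* regions[0] plays the default mapping, regions[1] the time-namespace one. */
static struct fake_vvar_region regions[2];

/* Shift the RNG data pointer by the distance between the two clock data blocks. */
const struct fake_rng_data *fake_get_rng_data(void)
{
	const char *rng = (const char *)&regions[0].rng_data;

	if (regions[0].vdso_data.clock_mode == FAKE_CLOCKMODE_TIMENS)
		return (const void *)(rng + ((const char *)&regions[1].vdso_data - (const char *)&regions[0].vdso_data));
	return (const void *)rng;
}

/* Usage sketch: the returned pointer follows the clock mode. */
void fake_get_rng_data_demo(void)
{
	regions[0].vdso_data.clock_mode = FAKE_CLOCKMODE_NONE;
	assert(fake_get_rng_data() == &regions[0].rng_data);
	regions[0].vdso_data.clock_mode = FAKE_CLOCKMODE_TIMENS;
	assert(fake_get_rng_data() == &regions[1].rng_data);
}
```
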
> > +/*
> > + * Generates a given positive number of block of ChaCha20 output with nonce=0,
> > + * and does not write to any stack or memory outside of the parameters passed
> > + * to it. This way, we don't need to worry about stack data leaking into forked
> > + * child processes.
>
> Please use proper kernel-doc
>
> > + */
> > +static __always_inline void __arch_chacha20_blocks_nostack(u8 *dst_bytes, const u32 *key, u32 *counter, size_t nblocks)
> > +{
> > + extern void chacha20_blocks_nostack(u8 *dst_bytes, const u32 *key, u32 *counter, size_t nblocks);
> > + return chacha20_blocks_nostack(dst_bytes, key, counter, nblocks);
>
> The above aside, can you please explain the value of this __arch_()
> wrapper?
>
> It's just voodoo for no value because it hands through the arguments
> 1:1. So where are you expecting that that __arch...() version of this is
> any different than invoking the architecture specific version of
> chacha20_blocks_nostack().

I'll just name the assembly function with __arch...(). The idea behind
the wrapper was to keep all of the non-generic code called from the
generic code prefixed with __arch_, but there's no reason that prefix
has to come from a C wrapper. Will fix for v8.

Thanks again,
Jason