Re: [PATCH 2/2] x86/random: Issue a warning if RDRAND or RDSEED fails

From: Daniel P. Berrangé
Date: Tue Feb 06 2024 - 08:00:25 EST


On Tue, Feb 06, 2024 at 06:04:45AM -0600, Dr. Greg wrote:
> On Tue, Feb 06, 2024 at 08:04:57AM +0000, Daniel P. Berrangé wrote:
>
> Good morning to everyone.
>
> > On Mon, Feb 05, 2024 at 07:12:47PM -0600, Dr. Greg wrote:
> > >
> > > Actually, I now believe there is clear evidence that the problem is
> > > indeed Intel specific. In light of our testing, it will be
> > > interesting to see what your 'AR' returns with respect to an official
> > > response from Intel engineering on this issue.
> > >
> > > One of the very bright young engineers collaborating on Quixote, who
> > > has been following this conversation, took it upon himself to do some
> > > very methodical engineering analysis on this issue. I'm the messenger
> > > but this is very much his work product.
> > >
> > > Executive summary is as follows:
> > >
> > > - No RDRAND depletion failures were observable with either the Intel
> > > or AMD hardware that was load tested.
> > >
> > > - RDSEED depletion is an Intel specific issue, AMD's RDSEED
> > > implementation could not be provoked into failure.
>
> > My colleague ran a multithreaded parallel stress test program on his
> > 16core/2HT AMD Ryzen (Zen4 uarch) and saw an 80% failure rate in
> > RDSEED.
>
> Interesting datapoint, thanks for forwarding it along, so the issue
> shows up on at least some AMD platforms as well.
>
> On the 18 core/socket Intel Skylake platform, the parallelized
> depletion test forces RDSEED success rates down to around 2%. It
> would appear that your tests suggest that the AMD platform fares
> better than the Intel platform.

Yes, given how much slower the AMD RDRAND/RDSEED ops are compared to
my Intel test platforms, their DRBG looks better able to keep up with
the demand for bits.

> Of course, the other variable may be how the parallelized stress test
> is conducted. If you would like to share your implementation source
> we could give it a twirl on the systems we have access to.

It is just Jason's earlier test program, but run in one thread
per core:

$ cat cpurngstress.c
#include <stdio.h>
#include <immintrin.h>
#include <pthread.h>
#include <unistd.h>

/*
 * Gives about 25 seconds wallclock time on my Alderlake CPU
 *
 * Probably want to reduce this x10, or possibly even x100
 * on AMD due to much slower ops.
 */
#define MAX_ITER 10000000

#define MAX_CPUS 4096

void *doit(void *f) {
    unsigned long long rand;
    unsigned int i, success_rand = 0, success_seed = 0;

    for (i = 0; i < MAX_ITER; ++i) {
        success_seed += !!_rdseed64_step(&rand);
    }
    for (i = 0; i < MAX_ITER; ++i) {
        success_rand += !!_rdrand64_step(&rand);
    }

    fprintf(stderr,
            "RDRAND: %.2f%%, RDSEED: %.2f%%\n",
            success_rand * 100.0 / MAX_ITER,
            success_seed * 100.0 / MAX_ITER);

    return NULL;
}


int main(int argc, char *argv[])
{
    pthread_t th[MAX_CPUS];
    int nproc = sysconf(_SC_NPROCESSORS_ONLN);
    if (nproc > MAX_CPUS) {
        nproc = MAX_CPUS;
    }
    fprintf(stderr, "Stressing RDRAND/RDSEED across %d CPUs\n", nproc);

    for (int i = 0; i < nproc; i++) {
        pthread_create(&th[i], NULL, doit, NULL);
    }

    for (int i = 0; i < nproc; i++) {
        pthread_join(th[i], NULL);
    }

    return 0;
}

$ gcc -march=native -pthread -o cpurngstress cpurngstress.c


> If there is the possibility of over-harvesting randomness, why not
> design the implementations to be clamped at some per core value such
> as a megabit/second. In the case of the documented RDSEED generation
> rates, that would allow the servicing of 3222 cores, if my math at
> 0530 in the morning is correct.
>
> Would a core need more than 128 kilobytes of randomness, i.e. one
> second of output, to effectively seed a random number generator?
>
> A cynical conclusion would suggest engineering acquiescing to marketing
> demands... :-)

My assumption is that it was simply easier to not implement a
rate limiting feature at the CPU level and punt the starvation
problem to software :-)

With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|