Re: CONFIG_RANDOM (compromise?)

Albert Cahalan (albert@ccs.neu.edu)
Tue, 28 May 1996 23:03:02 -0400 (EDT)


Perhaps someone with 64kB of cache could test this.
(common on 486SLC2, 486SX, 486SL, and such systems)

> I've done my own cycle-counting benchmark. Each call to mix in
> randomness only takes 150 to 1000 cycles, depending on memory caching
> effects (this is on a laptop without a lot of cache memory). On a 75MHz
> Pentium, this translates to 2 to 13 microseconds, with the average being
> 8 microseconds. This is *not* a lot of time --- consider that the
> default Linux clock tick is 0.01 seconds, and we're talking about
> 0.000008 seconds.

On a 386SX-16?
My guess: time per call * clock-speed ratio * architecture penalty
 = 0.000008 * (75/16) * (2*1.5) ≈ 0.0001 seconds per call.
Now multiply that by the number of calls. At 1000 calls per second, the
system would be spending maybe 10% of its time gathering entropy. That is
of course a wild guess, but it's better than a BogoMIPS-based estimate.
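For what it's worth, here is that arithmetic spelled out; the 75/16 clock
ratio and the 2*1.5 architecture penalty are assumptions of mine, not
measurements:

/* Back-of-envelope estimate, not a measurement: scale the 8 us
 * average from a 75 MHz Pentium down to a 386SX-16 using two
 * assumed fudge factors. */
#include <stdio.h>

int main(void)
{
    double per_call = 0.000008;     /* 8 us average on the Pentium-75 */
    double clock    = 75.0 / 16.0;  /* 75 MHz down to 16 MHz */
    double arch     = 2.0 * 1.5;    /* assumed 386-vs-Pentium penalty */
    double calls    = 1000.0;       /* mixing calls per second */
    double est      = per_call * clock * arch;

    printf("%g s per call, %g%% of the CPU at %g calls/s\n",
           est, est * calls * 100.0, calls);
    return 0;
}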

> When you consider that during the hdparm benchmark, the process is
> blocked doing I/O most of the time, claims of significant I/O
> degradation are extremely suspicious. My own trials using hdparm -t,

It shows one aspect of performance. You could get a better picture
by running a CPU benchmark at the same time.
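For example, something as crude as the loop below, run alongside
hdparm -t with and without the random driver configured in, would show
how much CPU the entropy gathering really costs. The ten-second
interval is arbitrary:

/* Crude CPU-availability probe: count busy-loop iterations in a
 * fixed wall-clock interval.  Compare an idle run against a run
 * taken while "hdparm -t" is hammering the disk. */
#include <stdio.h>
#include <time.h>

int main(void)
{
    volatile unsigned long count = 0;
    time_t start = time(NULL);

    while (time(NULL) - start < 10)    /* spin for ten seconds */
        count++;

    printf("loop iterations in 10 seconds: %lu\n", count);
    return 0;
}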

> Next: It might be desirable to collect random events in the kernel.
> It is _not_ desirable to store and mix them with a complicated
> hash function in the kernel.
>
> Take a closer look at the code. You will see that it is not using a
> complicated hash function to mix in the random events. This was done by
> design, so that it wouldn't take many CPU cycles to mix in the
> randomness, while still doing a good job.
>
> It might be possible to move the complicated hash function (which is
> used when you actually read from /dev/random) out to a user-mode daemon,
> but you wouldn't be able to move it out to a library, because of
> user-to-user security concerns. (A random user program can't be allowed
> to directly see the random pool, since that would completely compromise the
> security of the random pool.) The real problem with a user-mode daemon,

With a library alone:
You wouldn't keep the old pool after giving it to a user (once a user
has seen it, it's compromised and has to be thrown away).

With a daemon (a rough sketch follows below):
Eliminates the need for /dev/urandom.
Makes it easy to find out who is hogging the entropy.
Is better for storing a really _large_ entropy pool (a gigabyte, even).
Prevents using /dev/random to detect when someone is using the machine.
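
Very roughly, the daemon I have in mind looks like this. Everything
here is invented for the sake of the sketch: the /dev/rawevents device
that would hand raw kernel samples to user space, the socket path, and
the toy mixing step standing in for a real SHA/MD5 mix:

/* Illustrative sketch only, not working code for any real interface. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <sys/select.h>

#define POOL_BYTES 4096            /* could be megabytes, or a disk file */

static unsigned char pool[POOL_BYTES];
static size_t mix_pos;

/* Toy mixing step; a real daemon would fold samples in with SHA or MD5. */
static void mix_in(const unsigned char *buf, size_t len)
{
    size_t i;
    for (i = 0; i < len; i++) {
        pool[mix_pos] ^= (unsigned char)(buf[i] + mix_pos * 7);
        mix_pos = (mix_pos + 1) % POOL_BYTES;
    }
}

int main(void)
{
    int raw = open("/dev/rawevents", O_RDONLY);    /* hypothetical device */
    int srv = socket(AF_UNIX, SOCK_STREAM, 0);
    struct sockaddr_un addr;
    unsigned char buf[64];
    fd_set fds;

    if (raw < 0 || srv < 0) {
        perror("setup");
        return 1;
    }
    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, "/var/run/entropy");     /* made-up path */
    unlink(addr.sun_path);
    if (bind(srv, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(srv, 5) < 0) {
        perror("socket");
        return 1;
    }

    for (;;) {
        FD_ZERO(&fds);
        FD_SET(raw, &fds);
        FD_SET(srv, &fds);
        if (select((raw > srv ? raw : srv) + 1, &fds, NULL, NULL, NULL) < 0)
            continue;
        if (FD_ISSET(raw, &fds)) {
            ssize_t n = read(raw, buf, sizeof(buf)); /* raw kernel samples */
            if (n > 0)
                mix_in(buf, (size_t)n);
        }
        if (FD_ISSET(srv, &fds)) {
            int client = accept(srv, NULL, NULL);
            if (client >= 0) {
                unsigned char out[16];
                size_t i;

                /* per-client accounting (who is hogging entropy) would
                 * go here; the output is only a stand-in for properly
                 * hashed output -- never hand out the raw pool itself */
                memset(out, 0, sizeof(out));
                for (i = 0; i < POOL_BYTES; i++)
                    out[i % sizeof(out)] ^= pool[i];
                write(client, out, sizeof(out));
                close(client);
            }
        }
    }
}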

> though, is that it would significantly reduce the chances that a user
> would have their system correctly configured to be able to use
> /dev/random, and as a result application programmers would be much less
> likely to use it. Security must be easy to use, or people won't use it.

The chances are slim now anyway: either you create the /dev/random
device node by hand or you install the daemon by hand. With a new
distribution, either method would be included.

> I've considered collecting the random events during interrupt time and
> only placing them in a ring buffer, and then only mixing them into the
> pool using a bottom-half driver. It's not clear that it'll save enough
> clock cycles to actually be worth it, though. I'll do some benchmark
> tests and see how they work out.

Thank you.
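
Just to make sure we are picturing the same thing, here is roughly what
I understand the ring-buffer idea to be. The names, the ring size, and
the mix_into_pool() stub are invented, and overflow handling is left
out; this is not the actual driver code:

/* Sketch of the ring-buffer idea.  The interrupt handler only stores
 * a timestamped sample; the expensive mixing happens later, outside
 * interrupt context, in a bottom half. */
#define RING_SIZE 64                   /* must be a power of two */

struct sample {
    unsigned long time;                /* e.g. a timer or cycle count */
    unsigned long value;               /* e.g. scancode or IRQ number */
};

static struct sample ring[RING_SIZE];
static unsigned int head, tail;

/* Stand-in for the real routine that folds a sample into the pool. */
static void mix_into_pool(unsigned long time, unsigned long value)
{
    (void)time;
    (void)value;
}

/* Called from interrupt handlers: just a couple of stores and an
 * increment, so the cost at interrupt time stays tiny.  The real
 * code would also mark the bottom half as pending here. */
static void add_sample(unsigned long time, unsigned long value)
{
    struct sample *s = &ring[head & (RING_SIZE - 1)];

    s->time  = time;
    s->value = value;
    head++;
}

/* Called from the bottom half: drain the ring and do the expensive
 * mixing with interrupts enabled. */
static void random_bottom_half(void)
{
    while (tail != head) {
        struct sample *s = &ring[tail & (RING_SIZE - 1)];

        mix_into_pool(s->time, s->value);
        tail++;
    }
}

/* Tiny user-space harness just to exercise the sketch. */
int main(void)
{
    add_sample(123456UL, 42UL);        /* pretend these came from IRQs */
    add_sample(123999UL, 17UL);
    random_bottom_half();
    return 0;
}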

> Does anyone else think the kernel has too many checksum/CRC/hash
> functions? We need one for Elf, one(?) for networking, and one
> for ftape. The decompression code does not need its own CRC.
> Neither does /dev/random.
>
> The networking code will eventually need to use both SHA and MD5; the
> protocols demand it. Similarly, the decompression code needs to use
> CRC, since that's what the gzip format requires. During the Berlin
> conference, I talked to Alan about separating out the hash functions
> which are currently in /dev/random into the lib directory of the kernel
> during the 2.1 kernel development, since it's clear that future secure
> networking protocols will also require access to the SHA and MD5
> functions.

Since the networking protocols require SHA and MD5, those must be
in the kernel. The particular way that Linux boots is not an ISO
or RFC anything, so that can be changed. It should be simple to
write a program that replaces the gzip CRC (and magic) with something
else. Maybe the FSF would even accept an alternate algorithm to be
used only when gzip is explicitly told to do so. For now though, a
simple hack_gz_for_linux could switch to MD5 or SHA.
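
The file-patching half of that hack is nearly trivial. As a sketch,
assuming a single-member .gz file whose last eight bytes are the CRC32
and the uncompressed length: computing the replacement MD5/SHA digest
and teaching the boot decompressor to check it are the real work, and
are not shown here, so the 32-bit value just comes from the command line:

/* hack_gz_for_linux sketch: overwrite the CRC32 field in a .gz
 * trailer with some other 32-bit value (say, a truncated MD5 or SHA
 * digest computed elsewhere).  Purely illustrative.
 *
 * Usage: hack_gz_for_linux file.gz <32-bit hex value> */
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    FILE *f;
    unsigned long digest;
    unsigned char bytes[4];

    if (argc != 3) {
        fprintf(stderr, "usage: %s file.gz hexdigest32\n", argv[0]);
        return 1;
    }
    digest = strtoul(argv[2], NULL, 16);

    f = fopen(argv[1], "r+b");
    if (!f) {
        perror(argv[1]);
        return 1;
    }

    /* the CRC32 field sits 8 bytes from the end, little-endian */
    bytes[0] = digest & 0xff;
    bytes[1] = (digest >> 8) & 0xff;
    bytes[2] = (digest >> 16) & 0xff;
    bytes[3] = (digest >> 24) & 0xff;

    if (fseek(f, -8L, SEEK_END) != 0 || fwrite(bytes, 1, 4, f) != 4) {
        perror("patch");
        fclose(f);
        return 1;
    }
    fclose(f);
    return 0;
}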

> P.S. In the latest incarnation, the /dev/random driver now only takes
> 10k of memory. I've done some further optimization of its memory
> footprint, mostly by eliminating inline functions where they weren't
> necessary. The driver is now also being used by the TCP layer to
> generate secure TCP sequence numbers. These patches will hopefully be
> going into pre-2.0.9, although I'll be asking Alan to vet the TCP
> changes first. They're pretty straightforward, though.

That's an improvement I like very much. I still think that it is
rather cruel to make everyone collect entropy. BTW, can we call
it /dev/entropy? There's been some confusion already, and this
is the linux-kernel list. See what happens when people (both the
newbies and the do-it-in-userspace diehards) find /dev/random on
new Linux systems.