Re: lustre: why does cfs_get_random_bytes() exist?

From: Dilger, Andreas
Date: Thu Oct 03 2013 - 19:07:08 EST


On 2013/10/03 1:34 PM, "Theodore Ts'o" <tytso@xxxxxxx> wrote:
>On Thu, Oct 03, 2013 at 10:26:21AM -0700, Greg KH wrote:
>> > Does this sound reasonable?
>>
>> Sounds reasonable to me, care to send a patch to do so?
>>
>
>I can do that, but I was waiting for Andras, Peng or Nikita to let me
>now if there was something I was missing or not. I'm pretty sure it's
>something bogus, perhaps left over from a OS abstraction layer to
>support Solaris or some such, but I am curious what was the historical
>reason for the current code.

The root of the problem with the existing get_random_bytes() is that when
the Cray supercomputers with around 20k nodes reboot all at the same instant,
we had problems with the Lustre client UUID (originally from
generate_random_uuid()) being identical between multiple nodes, since they
have virtually no entropy at boot time and did not have a hardware RNG.

The Lustre cfs_get_random_bytes() incorporates (via cfs_rand()) a seed which
also hashes in the addresses from any network interfaces that are configured.
In addition, cfs_rand() is seeded at startup from get_random_bytes() in case
a hardware RNG is available. This ensures that even with identical initial
conditions cfs_get_random_bytes() gets a different random stream on each node.
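
As a rough userspace illustration of that seeding idea (not the actual Lustre
code), the sketch below mixes a base seed with a list of interface addresses.
The names fnv1a_update() and node_seed() are hypothetical, and FNV-1a is only
a stand-in for whatever hash the real implementation uses.

```c
#include <stddef.h>
#include <stdint.h>

/* FNV-1a (32-bit) update step; an assumed stand-in for the real mixing. */
static uint32_t fnv1a_update(uint32_t h, const void *buf, size_t len)
{
	const unsigned char *p = buf;
	size_t i;

	for (i = 0; i < len; i++) {
		h ^= p[i];
		h *= 16777619u;		/* FNV prime */
	}
	return h;
}

/*
 * Hypothetical helper: derive a per-node seed from a base seed (which may
 * be identical on every node at boot) and the node's interface addresses.
 */
static uint32_t node_seed(uint32_t base_seed, const uint32_t *addrs, size_t n)
{
	uint32_t h = 2166136261u;	/* FNV offset basis */
	size_t i;

	h = fnv1a_update(h, &base_seed, sizeof(base_seed));
	for (i = 0; i < n; i++)
		h = fnv1a_update(h, &addrs[i], sizeof(addrs[i]));
	return h;
}
```

Two nodes that start from the same base seed but have different addresses
end up with different seeds, which is the property that matters here.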

It is also true that cfs_get_random_bytes() and cfs_rand() predate the newer
APIs such as prandom_u32() (which, to be honest, I didn't even know existed
today). This is work that was done for maybe 2.4.29? (RH 9?) kernels.

The number of users of cfs_get_random_bytes() is relatively small, mostly
just for the client UUID and other unique identifiers at startup time. The
only place it is used on an ongoing basis is for the capabilities that are
passed from the metadata server to clients so the clients can validate they
have access to the objects on the data server. The rest of the code uses
cfs_rand() for values that only need to be statistically uniform over time,
so they don't deplete the entropy pool.
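
To make that distinction concrete, here is a minimal sketch of a generator
that is seeded once and then never touches the entropy pool again. It uses
xorshift32 purely as an example; this is an assumption for illustration and
not the algorithm behind cfs_rand(). The sketch_rng names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical PRNG state; seeded once, then advanced arithmetically. */
struct sketch_rng {
	uint32_t state;
};

static void sketch_rng_seed(struct sketch_rng *r, uint32_t seed)
{
	/* xorshift requires a non-zero state */
	r->state = seed ? seed : 1u;
}

/* Returns the next value; draws nothing from any entropy source. */
static uint32_t sketch_rng_next(struct sketch_rng *r)
{
	uint32_t x = r->state;

	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	r->state = x;
	return x;
}
```

The output is fine for load spreading and similar uses, but it is fully
predictable from the seed, so it is not suitable for UUIDs or capabilities.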

I'm not against cleaning this up, if there is some mechanism for the startup
code to mix the node interface addresses into the entropy pool, and this is
also used to perturb the prandom_u32() sequence after that point. Besides
that initialization, it would be a simple search & replace for the users.

Cheers, Andreas
--
Andreas Dilger

Lustre Software Architect
Intel High Performance Data Division

