[PATCH] Late Re Initialization Of Strong RNG (random.c)

From: Konstantin Baydarov
Date: Wed May 13 2009 - 12:17:52 EST


Hello,
Recently I've faced the following issue:
When I use the "jiffies" clocksource ("clock=jiffies" on the kernel boot line) with an NFS rootfs, I get the following errors before the bash login prompt appears:
NFS: server 10.150.41.21 error: fileid changed
fsid 0:e: expected fileid 0x6cc16f, got 0x9255de
or
NFS: readdir reply truncated!

The errors appear after 2-3 reboots.

Root Cause:
It turned out that on every boot, when the kernel NFS client initializes the XID field of an NFS RPC request by calling random32(), random32() returns pseudorandom numbers instead of strong random ones. So on every boot the kernel NFS client starts sending RPC requests to the NFS server (Linux NFSD) with the same XID value. After a couple of board reboots, the NFS server's request cache fills up with these repeated pseudorandom XIDs, false NFSD cache hits start to occur, and the following errors appear:
fsid 0:e: expected fileid 0x6cc16f, got 0x9255de
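
To illustrate the failure mode, here is a minimal user-space sketch. It is not the kernel's random32() implementation: toy_random32() is a hypothetical stand-in (a plain xorshift32 step), and first_xid_after_boot() is a made-up helper that models "seed the generator at boot, then draw the first XID". The only point is that an identical seed on every boot gives an identical first XID.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for random32(): one xorshift32 step.
 * This is NOT the generator used in lib/random32.c; it only models
 * a deterministic PRNG whose output depends solely on its seed.
 */
uint32_t toy_random32(uint32_t *state)
{
    uint32_t x = *state;

    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/*
 * Hypothetical helper: the first XID a client would pick after a boot
 * that seeded the generator with `seed`.
 */
uint32_t first_xid_after_boot(uint32_t seed)
{
    uint32_t state = seed ? seed : 1; /* xorshift32 must not start at 0 */

    return toy_random32(&state);
}
```

Calling first_xid_after_boot() twice with the same seed models two boots that read the same "uptime" value, and both calls return the same XID. Two different seeds model boots that read different wall-clock values, and the XIDs differ.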

My board has an RTC, so the values returned by random32() during NFS XID initialization should be unique for every boot, because the strong RNG (drivers/char/random.c) uses getnstimeofday() (called from ktime_get_real()) for pool initialization, see init_std_data(). So the RNG pools should be initialized with different "timeofday" values, provided by the RTC hardware.
It turned out that the RTC driver, compiled into the kernel, starts later than the strong RNG (drivers/char/random.c) initializes its pools. So getnstimeofday(), called in init_std_data(), returns the same "uptime" values on every boot instead of the monotonically increasing "timeofday" values (the date in seconds or something like that).

How Solved:
To fix this I've added a late reinitialization of the strong RNG (in the late_initcall_sync section), to make sure that RNG pool initialization occurs after any board-specific RTC driver has been initialized. I've also added a random32 reseed after the strong RNG reinitialization.
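
The ordering problem and the fix can be modelled with a small user-space sketch. Everything here is hypothetical: struct sim_clock, sim_timeofday() and sim_boot() are illustration-only names, and the "uptime" steps 1, 2, 3 are arbitrary. The assumption is that the clock reads as plain uptime until the RTC driver has set it, and as RTC time plus uptime afterwards.

```c
#include <stdint.h>

/* Simulated clock: reads as uptime until the RTC driver has set it. */
struct sim_clock {
    uint64_t uptime_s;    /* seconds since this boot started */
    uint64_t rtc_epoch_s; /* wall-clock time held by the RTC at boot */
    int rtc_ready;        /* nonzero once the RTC driver has run */
};

/* Seed values seen by the early init and by the late re-init. */
struct seeds {
    uint64_t early;
    uint64_t late;
};

uint64_t sim_timeofday(const struct sim_clock *c)
{
    return c->rtc_ready ? c->rtc_epoch_s + c->uptime_s : c->uptime_s;
}

/* Model one boot with an arbitrary, fixed ordering of the three steps. */
struct seeds sim_boot(uint64_t rtc_epoch_s)
{
    struct sim_clock c = { 0, rtc_epoch_s, 0 };
    struct seeds s;

    c.uptime_s = 1;              /* rand_initialize() runs: RTC not ready */
    s.early = sim_timeofday(&c);

    c.uptime_s = 2;              /* RTC driver initializes after that */
    c.rtc_ready = 1;

    c.uptime_s = 3;              /* rand_late_initialize() runs last */
    s.late = sim_timeofday(&c);

    return s;
}
```

In this model, two boots with different RTC values get the same `early` seed but different `late` seeds, which is the effect the late reinitialization is meant to achieve.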

I'm using a PPC board, and I can't reproduce this issue with the "timebase" clocksource, the default clocksource for my board. I believe that's because "timebase" has a higher resolution than "jiffies". But I believe it is possible that the "NFS: readdir reply truncated!" errors can appear on other boards with their default clocksource.
Anyway, it's safer to initialize the RNG with "timeofday" values instead of the "uptime" value. IIUC, even with a high-resolution clocksource, the "uptime" value can be the same for two sequential boots.

Patch against linux-2.6.29.3.

Signed-off-by: Konstantin Baydarov <kbaidarov@xxxxxxxxxxxxx>

Index: linux-2.6.29.3/drivers/char/random.c
===================================================================
--- linux-2.6.29.3.orig/drivers/char/random.c
+++ linux-2.6.29.3/drivers/char/random.c
@@ -947,6 +947,15 @@ static int rand_initialize(void)
}
module_init(rand_initialize);

+static int __init rand_late_initialize(void)
+{
+ rand_initialize();
+ random32_reseed();
+
+ return 0;
+}
+late_initcall_sync(rand_late_initialize);
+
void rand_initialize_irq(int irq)
{
struct timer_rand_state *state;
Index: linux-2.6.29.3/include/linux/random.h
===================================================================
--- linux-2.6.29.3.orig/include/linux/random.h
+++ linux-2.6.29.3/include/linux/random.h
@@ -73,6 +73,7 @@ unsigned long randomize_range(unsigned l

u32 random32(void);
void srandom32(u32 seed);
+int random32_reseed(void);

#endif /* __KERNEL___ */

Index: linux-2.6.29.3/lib/random32.c
===================================================================
--- linux-2.6.29.3.orig/lib/random32.c
+++ linux-2.6.29.3/lib/random32.c
@@ -133,7 +133,7 @@ core_initcall(random32_init);
* Generate better values after random number generator
* is fully initalized.
*/
-static int __init random32_reseed(void)
+int random32_reseed(void)
{
int i;

@@ -151,4 +151,5 @@ static int __init random32_reseed(void)
}
return 0;
}
+EXPORT_SYMBOL(random32_reseed);
late_initcall(random32_reseed);