the random driver

From: Andrew Morton (akpm@digeo.com)
Date: Wed Nov 20 2002 - 02:46:53 EST


I'm seeing a context switch rate of 150/sec just writing a
big file to disk at 20 megabytes/sec.

It is coming out of add_disk_randomness()'s invokation of
batch_entropy_store().

That function is setting up deferred punt-to-process-context
for every disk request, and has the potential to cause 1000
context switches per second. This is clearly excessive.

There is a 256 slot buffer in the random driver for this,
and we are not using it at all effectively. I do intend
to submit the below patch which will cause one context switch
per 128 requests.

But this is a minimal fix. The batch_entropy_pool handling
in there needs work.

a) It's racy. The head and tail pointers have no SMP protection
   and a race will cause it to dump 128 already-processed items
   back into the entropy pool.

b) It's weird. What's up with this?

        batch_entropy_pool[2*batch_head] = a;
        batch_entropy_pool[(2*batch_head) + 1] = b;

   It should be an array of 2-element structures.

c) The ring-buffer handling is awkward. It shouldn't be masking
   the head and tail pointers to always remain within bounds.

   A better technique is to allow these indices to wrap at
   0xffffffff and only mask their values when you actually use
   them as a subscript. This allows you to distinguish the
   completely-full case from the completely-empty one. See
   LOG_BUF* in kernel/printk.c.

d) It's punting work up to process context which could be performed
   right there in interrupt context.

My suggestion, if anyone cares, is to convert the entropy pool
into smaller per-cpu buffers, protected by local_irq_save() only.
This way the global lock (which isn't there yet) only needs to
be taken when a CPU is actually dumping its buffer.

--- 25/drivers/char/random.c~reduce-random-context-switch-rate Tue Nov 19 23:17:12 2002
+++ 25-akpm/drivers/char/random.c Tue Nov 19 23:18:11 2002
@@ -663,7 +663,7 @@ void batch_entropy_store(u32 a, u32 b, i
         batch_entropy_credit[batch_head] = num;
 
         new = (batch_head+1) & (batch_max-1);
- if (new != batch_tail) {
+ if ((new - batch_tail) >= batch_max / 2) {
                 /*
                  * Schedule it for the next timer tick:
                  */

_
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sat Nov 23 2002 - 22:00:31 EST