I'm seeing a context switch rate of 150/sec just writing a
big file to disk at 20 megabytes/sec.
It is coming out of add_disk_randomness()'s invokation of
batch_entropy_store().
That function is setting up deferred punt-to-process-context
for every disk request, and has the potential to cause 1000
context switches per second. This is clearly excessive.
There is a 256 slot buffer in the random driver for this,
and we are not using it at all effectively. I do intend
to submit the below patch which will cause one context switch
per 128 requests.
But this is a minimal fix. The batch_entropy_pool handling
in there needs work.
a) It's racy. The head and tail pointers have no SMP protection
and a race will cause it to dump 128 already-processed items
back into the entropy pool.
b) It's weird. What's up with this?
batch_entropy_pool[2*batch_head] = a;
batch_entropy_pool[(2*batch_head) + 1] = b;
It should be an array of 2-element structures.
c) The ring-buffer handling is awkward. It shouldn't be masking
the head and tail pointers to always remain within bounds.
A better technique is to allow these indices to wrap at
0xffffffff and only mask their values when you actually use
them as a subscript. This allows you to distinguish the
completely-full case from the completely-empty one. See
LOG_BUF* in kernel/printk.c.
d) It's punting work up to process context which could be performed
right there in interrupt context.
My suggestion, if anyone cares, is to convert the entropy pool
into smaller per-cpu buffers, protected by local_irq_save() only.
This way the global lock (which isn't there yet) only needs to
be taken when a CPU is actually dumping its buffer.
--- 25/drivers/char/random.c~reduce-random-context-switch-rate Tue Nov 19 23:17:12 2002
+++ 25-akpm/drivers/char/random.c Tue Nov 19 23:18:11 2002
@@ -663,7 +663,7 @@ void batch_entropy_store(u32 a, u32 b, i
batch_entropy_credit[batch_head] = num;
new = (batch_head+1) & (batch_max-1);
- if (new != batch_tail) {
+ if ((new - batch_tail) >= batch_max / 2) {
/*
* Schedule it for the next timer tick:
*/
_
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
This archive was generated by hypermail 2b29 : Sat Nov 23 2002 - 22:00:31 EST