Re: [PATCH] random: mix all saved registers into entropy pool

From: Jörn Engel
Date: Thu Jun 12 2014 - 16:27:36 EST


On Wed, 11 June 2014 11:27:42 -0400, Theodore Ts'o wrote:
> On Tue, Jun 10, 2014 at 08:10:09PM -0400, Jörn Engel wrote:
> > > I'm also concerned about how much overhead this might eat up. I've
> > > already had someone who was benchmarking a high performance storage
> > > array where the increased interrupt latency before adding something
> > > like this was something he noticed, and kvetched to me about. The
> > > pt_regs structure is going to be larger than the fast_pool (which is
> > > only 16 bytes), and we're going to be doing it once every jiffy, which
> > > means between 100 and 1000 times per second.
> >
> > Would that someone be me? ;)
> >
> > The reason I prefer doing it once every jiffy is that it doesn't
> > change with interrupt load. You get 10k interrupts per second?
> > You pay once per jiffy. 100k interrupts per second? Pay once per
> > jiffy.
>
> No, actually, it was someone who was worried about tail latency. So
> even if the average overhead was small, if once a jiffy we had a much
> larger time spent in the interrupt handler, that's something that
> would be a big concern for someone who was worried about big storage
> array performance.
>
> I'd be happier if we used fast_mix() and mixed the registers into the
> fast pool. That's much lighter weight than using mix_pool_bytes(),
> which involves many more memory accesses, MUCH higher probability of
> cache line bouncing between CPUs, etc.
>
> And there have been some proposals that I've been looking at to make
> fast_mix use even less overhead, so if we use fast_mix, any
> changes we make to improve that will help here too.

That makes sense to me. It would require replacing current fast_mix()
with something that doesn't assume __u32 input[4] as the only
possible parameter. In a previous incarnation of my patch I was using
a single round of siphash to condense the registers and added that to
our fast_pool.
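
As a rough illustration of a mixing helper that is not tied to a
four-word input, here is a minimal user-space sketch. It is neither the
kernel's fast_mix() nor the siphash round mentioned above: struct
sketch_pool, sketch_rol32() and sketch_mix_words() are hypothetical
names, and the rotate-and-XOR step is only a placeholder for whatever
mixing function would actually be chosen.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a small per-CPU pool: four 32-bit words. */
struct sketch_pool {
	uint32_t pool[4];
	unsigned int rotate;	/* current rotation amount, 0..31 */
};

/* Rotate left by s bits; s == 0 is handled to avoid a 32-bit shift. */
static inline uint32_t sketch_rol32(uint32_t w, unsigned int s)
{
	s &= 31;
	return s ? (w << s) | (w >> (32 - s)) : w;
}

/*
 * Fold nwords 32-bit words into the pool.  The caller passes an
 * explicit length, so a register dump of any size can be condensed
 * into the same four words.  Placeholder mixing, not a vetted design.
 */
static void sketch_mix_words(struct sketch_pool *f, const uint32_t *in,
			     size_t nwords)
{
	size_t i;

	assert(f != NULL && (in != NULL || nwords == 0));
	for (i = 0; i < nwords; i++) {
		f->pool[i & 3] ^= sketch_rol32(in[i], f->rotate);
		f->rotate = (f->rotate + 7) & 31;
	}
}
```

The point of the explicit length is only that the pool size stays fixed
while the input size can follow the architecture's register set.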

While siphash is fairly fast, doing that for every interrupt seemed
too much for me, hence the current ratelimit approach. But if people
care more about maximum latency than amortized cost, we can combine
siphash (or something similar) with the ratelimit.
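
The ratelimit idea can be sketched in user space as follows. This is an
illustration under assumptions, not the code from the patch: struct
sketch_ratelimit and sketch_should_mix() are hypothetical names, and
the `now` argument stands in for the kernel's jiffies counter.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-CPU state for the rate limit. */
struct sketch_ratelimit {
	uint64_t last;	/* tick at which the expensive mix last ran */
	bool primed;	/* false until the first call */
};

/*
 * Return true at most once per distinct tick value.  The cost of the
 * expensive mix is then bounded by the tick rate, regardless of how
 * many interrupts arrive within one tick.
 */
static bool sketch_should_mix(struct sketch_ratelimit *rl, uint64_t now)
{
	if (rl->primed && rl->last == now)
		return false;
	rl->last = now;
	rl->primed = true;
	return true;
}
```

An interrupt handler would call sketch_should_mix() first and only do
the expensive register mixing when it returns true.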

At any rate, here is a slightly updated patch that is independent of
CONFIG_HZ.

Jörn

--
"Error protection by error detection and correction."
-- from a university class