Re: [PATCH crypto-stable] crypto: arch/lib - limit simd usage to PAGE_SIZE chunks

From: Jason A. Donenfeld
Date: Wed Apr 22 2020 - 15:35:29 EST


On Wed, Apr 22, 2020 at 5:28 AM Sebastian Andrzej Siewior
<bigeasy@xxxxxxxxxxxxx> wrote:
>
> On 2020-04-22 09:23:34 [+0200], Ard Biesheuvel wrote:
> > My memory is a bit fuzzy here. I remember talking to the linux-rt guys
> > about what delay is actually acceptable, which was a lot higher than I
> > had thought based on their initial reports about scheduling blackouts
> > on arm64 due to preemption remaining disabled for too long. I intended
> > to revisit this with more accurate bounds but then I apparently
> > forgot.
> >
> > So SIMD chacha20 and SIMD poly1305 both run in <5 cycles per byte,
> > both on x86 and ARM. If we take 20 microseconds as a ballpark upper
> > bound for how long preemption may be disabled, that gives us ~4000
> > bytes of ChaCha20 or Poly1305 on a hypothetical 1 GHz core.
> >
> > So I think 4 KB is indeed a reasonable quantum of work here. Only
> > PAGE_SIZE is not necessarily equal to 4 KB on arm64, so we should use
> > SZ_4K instead.
> >
> > *However*, at the time, the report was triggered by the fact that we
> > were keeping SIMD enabled across calls into the scatterwalk API, which
> > may call kmalloc()/kfree() etc. There is no need for that anymore, now
> > that the FPU begin/end routines all have been optimized to restore the
> > userland SIMD state lazily.
>
> The 20usec sounds reasonable. The other concern was memory allocation
> within the preempt-disable section. If this is no longer the case,
> perfect.

Cool, thanks for the confirmation. I'll get a v2 of this patch out the door.
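
For reference, the arithmetic and the chunking idea quoted above can be
sketched in plain userspace C. This is only an illustration, not the
patch itself: CHUNK_SZ_4K, bytes_within_budget(), process_chunked() and
the fake_simd_*() helpers are hypothetical names made up for this
sketch. In the kernel, the stand-ins would correspond to something like
kernel_fpu_begin()/kernel_fpu_end() on x86 or
kernel_neon_begin()/kernel_neon_end() on arm64, and the constant to
SZ_4K.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's SZ_4K; fixed at 4 KB so it
 * does not grow with PAGE_SIZE on arm64 configurations. */
#define CHUNK_SZ_4K 4096u

/* Bytes that fit into a preempt-off budget. A core clock in MHz is the
 * same number as cycles per microsecond, so cycles = usec * MHz. */
static uint64_t bytes_within_budget(uint64_t budget_usec, uint64_t core_mhz,
				    uint64_t cycles_per_byte)
{
	assert(cycles_per_byte > 0);
	return budget_usec * core_mhz / cycles_per_byte;
}

/* Counts how many separate "SIMD enabled" sections were entered. */
static unsigned int simd_sections;

/* Stand-ins (assumptions) for the real begin/end routines. */
static void fake_simd_begin(void) { simd_sections++; }
static void fake_simd_end(void) { }

/* Stand-in for the actual SIMD cipher routine: a trivial byte-wise XOR. */
static void fake_simd_xor(uint8_t *dst, const uint8_t *src, size_t len)
{
	for (size_t i = 0; i < len; i++)
		dst[i] = src[i] ^ 0x5a;
}

/* Process the input in chunks of at most 4 KB, leaving the SIMD section
 * between chunks so that preemption is never off for more than one
 * chunk's worth of work. */
static void process_chunked(uint8_t *dst, const uint8_t *src, size_t len)
{
	while (len) {
		size_t todo = len < CHUNK_SZ_4K ? len : CHUNK_SZ_4K;

		fake_simd_begin();
		fake_simd_xor(dst, src, todo);
		fake_simd_end();

		dst += todo;
		src += todo;
		len -= todo;
	}
}
```

With a 20 usec budget, a 1 GHz core and 5 cycles per byte,
bytes_within_budget(20, 1000, 5) works out to 4000, which is where the
~4 KB quantum comes from. A 10000 byte input passed to
process_chunked() is split into three sections (4096 + 4096 + 1808).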