Re: [PATCH v4 6/7] sched: add function nr_running_cpu to expose number of tasks running on cpu

From: Peter Zijlstra
Date: Mon Jul 14 2014 - 14:18:05 EST

On Mon, Jul 14, 2014 at 10:05:34AM -0700, Tim Chen wrote:
> I was trying to explain why the algorithm is implemented this way
> because of its batching nature.
> There is a whole class of async algorithms that can provide
> substantial speedup by doing batch processing and using workqueues.
> The multi-buffer sha1 version has 2.2x speedup over existing
> AVX2 version, and can have even more speedup when AVX3
> comes round. Workqueue is a natural way to implement
> this. I don't think a throughput speedup of 2.2x is "crap".
> We are not inventing anything new, but asking for a
> very simple helper function to know if there's something else
> running on our cpu to help us make a better decision
> of whether we should flush the batched jobs immediately.
> And also asynchronous crypto interface is already used substantially
> in crypto and has a well established infrastructure.

The crap I was talking about is that there's a metric ton of 'async'
interfaces, all different.

Your multi-buffer thing isn't generic either, it seems limited to sha1.
It does not reuse padata, it does not extend workqueues, it does not
remove the btrfs nonsense, it adds yet another thing.
