Re: [PATCH v4 6/7] sched: add function nr_running_cpu to expose number of tasks running on cpu

From: Peter Zijlstra
Date: Tue Jul 15 2014 - 05:51:09 EST

On Mon, Jul 14, 2014 at 12:50:50PM -0700, Tim Chen wrote:

> There is a generic multi-buffer infrastructure portion that manages
> pulling and queuing jobs on the crypto workqueue, and it is separated out
> in patch 1 of the patchset.

There's one very weird multi-line comment in that patch.

> The other portions are algorithm specific that defines
> algorithm specific data structure and does the crypto computation
> for a particular algorithm, mostly in
> assemblies and C glue code. The infrastructure code is
> meant to be reused for other similar
> multi-buffer algorithms.

The flushing part that uses the sched thing is sha1 specific, even
though it strikes me as not being so. Flushing buffers on idle seems
like a 'generic' thing.

> We use nr_running_cpu to check whether there are other tasks running on
> the *current* cpu, (not for another cpu),

And yet, the function allows you do to exactly that..

> to decide if we should flush
> and compute crypto jobs accumulated. If there's nobody else running,
> we can take advantage of available cpu cycles on the cpu we are running
> on to do computation on the existing jobs in a SIMD mannner.
> Waiting a bit longer may accumulate more jobs to process in parallel
> in a single SIMD instruction, but will have more delay.

So you already have an idle notifier (which is x86 only, we should fix
that I suppose), and you then double check there really isn't anything
else running.

How much, if anything, does that second check buy you? There's just not
a single word on that.

Also, there is not a word on the latency vs throughput tradeoff you
make. I can imagine that for very short idle durations you loose, not
win with this thing.

So for now I still see no reason for doing this.

Also, I wonder about SMT, the point of this is to make best use of the
SIMD pipelines, does it still make sense to use siblings at the same
time even though you're running hand crafted ASM to stuff the pipelines
to the brim? Should this thing be SMT aware and not gather queues for
both siblings?

Attachment: pgp8UWa5SM8LJ.pgp
Description: PGP signature