Re: [PATCH v4 6/7] sched: add function nr_running_cpu to expose number of tasks running on cpu

From: Tim Chen
Date: Tue Jul 15 2014 - 14:41:40 EST


On Tue, 2014-07-15 at 11:50 +0200, Peter Zijlstra wrote:
> On Mon, Jul 14, 2014 at 12:50:50PM -0700, Tim Chen wrote:
>
> > There is a generic multi-buffer infrastructure portion that manages
> > pulling and queuing jobs on the crypto workqueue, and it is separated out
> > in patch 1 of the patchset.
>
> There's one very weird multi-line comment in that patch.
>
> > The other portions are algorithm specific that defines
> > algorithm specific data structure and does the crypto computation
> > for a particular algorithm, mostly in
> > assemblies and C glue code. The infrastructure code is
> > meant to be reused for other similar
> > multi-buffer algorithms.
>
> The flushing part that uses the sched thing is sha1 specific, even
> though it strikes me as not being so. Flushing buffers on idle seems
> like a 'generic' thing.
>
> > We use nr_running_cpu to check whether there are other tasks running on
> > the *current* cpu, (not for another cpu),
>
> And yet, the function allows you do to exactly that..

How about a function "single_task_running()"? Something like

bool single_task_running()
{
if (cpu_rq(smp_processor_id())->nr_running == 1)
return true;
else
return false;
}

>
> > to decide if we should flush
> > and compute crypto jobs accumulated. If there's nobody else running,
> > we can take advantage of available cpu cycles on the cpu we are running
> > on to do computation on the existing jobs in a SIMD mannner.
> > Waiting a bit longer may accumulate more jobs to process in parallel
> > in a single SIMD instruction, but will have more delay.
>
> So you already have an idle notifier (which is x86 only, we should fix
> that I suppose), and you then double check there really isn't anything
> else running.
>
> How much, if anything, does that second check buy you? There's just not
> a single word on that.
>
> Also, there is not a word on the latency vs throughput tradeoff you
> make. I can imagine that for very short idle durations you loose, not
> win with this thing.

I am not crazy about the idle_notifier implementation either. Thinking
a bit more, I can probably achieve most of what I need in the code path
that pull jobs from the work queue. I'll flush the remaining
partially processed jobs if I don't have any more jobs to process and no
other tasks are running (i.e. single_task_running() is true).
I'll try to move this logic into the generic multi-buffer job handler
path and out of the idle_notifier.

>
> So for now I still see no reason for doing this.
>
> Also, I wonder about SMT, the point of this is to make best use of the
> SIMD pipelines, does it still make sense to use siblings at the same
> time even though you're running hand crafted ASM to stuff the pipelines
> to the brim? Should this thing be SMT aware and not gather queues for
> both siblings?

I am processing the crypto job on the same cpu that the crypto
jobs originate from to get best cache utilization.
It will be the scheduler's decision to move the task that's
generating a lot crypto jobs to another cpu, if the sibling
is also very busy.

Thanks for your comments and reviews.

Tim

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/