Re: [PATCH 0/2] nohz_full: Offload task_tick to remote housekeeping cpus for nohz_full cpus
From: preetium
Date: Thu Aug 20 2015 - 09:22:24 EST
> On Thu, Aug 13, 2015 at 05:05:45PM +0200, Peter Zijlstra wrote:
>> On Thu, Aug 13, 2015 at 02:44:02PM +0200, Frederic Weisbecker wrote:
>> > On Thu, Aug 13, 2015 at 02:22:23PM +0200, Peter Zijlstra wrote:
>> > > On Thu, Aug 13, 2015 at 02:55:36PM +0530, Vatika Harlalka wrote:
>> > > > This patchset is for offloading task_tick() to a remote housekeeping
>> > > > cpu. The larger aim is to stop ticks on nohz_full cpus. For this, extra
>> > > > work must be done by housekeeping cpus. So, task_tick is called from a
>> > > > delayed workqueue for nohz_full cpus and the work is requeued every second
>> > > > for those nohz_full cpus whose ticks are stopped while they are busy. In
>> > > > the rest of the cases it will lead to redundant accounting. To facilitate
>> > > > this, a new function tick_nohz_remote_tick_stopped is added to indicate
>> > > > whether ticks are stopped on a remote cpu.
>> > > > Tick related code in core.c is moved to tick.c
>> > >
>> > > *sigh* of course you didn't read what I've written on this topic..
>> >
>> > What is it? Note Vatika wrote this after my suggestion, so if there is an issue,
>> > I'm likely the responsible :-) But I don't recall you opposed to this solution.
>>
>> *sigh* of course you _could_ all use Google yourselves.
>>
>> Re-read: https://patches.linaro.org/28290/
>
> Sorry, there were dozens of threads about this issue and I got a bit
> confused.
>
>>
>> I see nothing like the stuff I asked for in here, on top it creates the
>> stupid tick.c file.
>
> Right. I initially thought that we should make sched_tick() just work with long delays.
> Then tglx suggested the offline idea but I lost track about our conversation.
>
> But yeah making that scheduler_tick() working with long delays sound much better. Certainly
> much more work but that's a natural evolution after all. It should pay in longer term.
>
> We can start with update_cpu_load_active() which only works with HZ frequency updates or
> nohz idle zero load decay. Now I think that stuff is only used for load balancing. I had
> hopes this thing could be removed. I think Alex Shin (IIRC) tried but the patchset didn't
> make it.
I don't think Peter is talking about delays in updating the scheduler stats.
Looking at the earlier discussion, it looks like we need to do periodic tick
tasks only on demand on the nohz_full cpus. We will perhaps need to do the
following (reiterating some points that Peter made earlier):
1. One of the tasks that scheduler_tick() does is trigger_load_balance(). If
we have to get rid of the residual tick, we need to move load balancing on
nohz_full cpus into nohz_idle_balance(). In addition to load balancing on
the idle cpus, this routine will load balance on the nohz_full cpus as well,
when they are running single tasks.
This seems to be a good move because it will avoid pulling more tasks onto
the nohz_full cpus while they are running single tasks, unless needed.
2. In nohz_idle_balance(), there need to be routines similar to
update_idle_cpu_load() for nohz_full cpus so that the cpu loads are updated
before triggering load balance on them. Let's call this
update_nohz_full_cpu_load().
This should include update_curr() and update_cpu_load_active() for nohz_full
cpus.
3. When scheduling stats are read, update_curr() and update_cpu_load_active()
will be called remotely.
The above three will ensure that the work otherwise done by the scheduling
tick happens only on demand on the nohz_full cpus, i.e. during load balancing
and when stats are read.
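To make the three points a bit more concrete, here is a rough userspace
model of the idea. It is not kernel code and does not use any kernel API.
update_nohz_full_cpu_load() is the hypothetical helper named in point 2;
struct cpu_model, model_nohz_idle_balance() and model_read_runtime() are
names made up purely for illustration, and the accounting is reduced to a
single runtime counter standing in for update_curr() and
update_cpu_load_active().

```c
/* Userspace model only; every name here is hypothetical. */
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

struct cpu_model {
	bool nohz_full;             /* tick stopped on this cpu */
	unsigned long nr_running;   /* runnable tasks on this cpu */
	unsigned long last_update;  /* time the stats were last brought up to date */
	unsigned long curr_runtime; /* runtime charged to the current task */
};

static struct cpu_model cpus[NR_CPUS];

/*
 * Point 2: bring a nohz_full cpu's stats up to date from another cpu.
 * Stands in for update_curr() + update_cpu_load_active(); the elapsed
 * time since the last update is charged in one step, however long it is.
 */
static void update_nohz_full_cpu_load(int cpu, unsigned long now)
{
	struct cpu_model *c;

	assert(cpu >= 0 && cpu < NR_CPUS);
	c = &cpus[cpu];

	if (!c->nohz_full)
		return; /* ticking cpus keep their own stats current */

	if (c->nr_running)
		c->curr_runtime += now - c->last_update;
	c->last_update = now;
}

/*
 * Points 1 and 2: the housekeeping cpu refreshes every nohz_full cpu
 * before balancing. Returns the number of cpus refreshed; the actual
 * balancing decision is left out of this model.
 */
static int model_nohz_idle_balance(unsigned long now)
{
	int cpu, refreshed = 0;

	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		if (!cpus[cpu].nohz_full)
			continue;
		update_nohz_full_cpu_load(cpu, now);
		refreshed++;
	}
	return refreshed;
}

/* Point 3: reading stats forces a remote update first. */
static unsigned long model_read_runtime(int cpu, unsigned long now)
{
	update_nohz_full_cpu_load(cpu, now);
	return cpus[cpu].curr_runtime;
}
```

In this model nothing runs periodically on the nohz_full cpu itself; its
stats only move when a balance pass or a stats read asks for them.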
Peter, Frederic, can you let us know if this is the right approach?
Regards
Preeti U Murthy
>
> Thanks.
>