Re: [RFC][PATCH 03/16] sched: Wrap rq::lock access

From: Subhra Mazumdar
Date: Fri Mar 22 2019 - 20:09:51 EST
On 3/21/19 2:20 PM, Julien Desfossez wrote:
> On Tue, Mar 19, 2019 at 10:31 PM Subhra Mazumdar <subhra.mazumdar@xxxxxxxxxx>
> wrote:
>> On 3/18/19 8:41 AM, Julien Desfossez wrote:

> On further investigation, we could see that the contention is mostly in
> the way rq locks are taken. With this patchset, we lock the whole core
> if cpu.tag is set for at least one cgroup. Due to this, __schedule() is
> more or less serialized for the core, and that accounts for the
> performance loss we are seeing. We also saw that newidle_balance()
> takes considerable time in load_balance() due to the rq spinlock
> contention. Do you think it would help if the core-wide locking were
> only performed when absolutely needed?
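
For context, "locking the whole core" means that every rq lock
acquisition on a tagged core resolves to one shared spinlock. A
simplified sketch of the rq_lockp() wrapper this series builds up to
(from my reading of the patches, not the literal patch code):

static inline raw_spinlock_t *rq_lockp(struct rq *rq)
{
	/*
	 * With core scheduling enabled, all SMT siblings of a core
	 * return the same lock, so __schedule() on one sibling
	 * serializes against __schedule() on all the others.
	 */
	if (static_branch_unlikely(&__sched_core_enabled))
		return &rq->core->__lock;	/* core-wide lock */

	return &rq->__lock;			/* per-rq lock */
}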

Is the core-wide lock primarily responsible for the regression? I ran up
to patch 12, which also has the core-wide lock for tagged cgroups and
also calls newidle_balance() from pick_next_task(), and I don't see any
regression. Of course the core sched version of pick_next_task() may be
doing more, but compared with __pick_next_task() it doesn't look too
horrible.
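
For reference, the slow path I have in mind looks roughly like this (a
sketch based on my reading of the reworked slow path earlier in this
series; pick_next_task_slowpath() is just an illustrative name for the
code after the restart label, and details may differ):

static struct task_struct *
pick_next_task_slowpath(struct rq *rq, struct rq_flags *rf)
{
	const struct sched_class *class;
	struct task_struct *p;

	/*
	 * Called with rq_lockp(rq) held. With the core-wide lock in
	 * place, a long newidle_balance() here keeps every SMT
	 * sibling of the core from entering __schedule().
	 */
	newidle_balance(rq, rf);

	for_each_class(class) {
		p = class->pick_next_task(rq, NULL, NULL);
		if (p)
			return p;
	}

	/* The idle class should always have a runnable task. */
	BUG();
}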