Re: [RFC][PATCH 03/16] sched: Wrap rq::lock access

From: Subhra Mazumdar
Date: Tue Mar 26 2019 - 21:05:45 EST



On 3/22/19 5:06 PM, Subhra Mazumdar wrote:

On 3/21/19 2:20 PM, Julien Desfossez wrote:
On Tue, Mar 19, 2019 at 10:31 PM Subhra Mazumdar <subhra.mazumdar@xxxxxxxxxx>
wrote:
On 3/18/19 8:41 AM, Julien Desfossez wrote:

On further investigation, we could see that the contention is mostly in the
way rq locks are taken. With this patchset, we lock the whole core if
cpu.tag is set for at least one cgroup. Due to this, __schedule() is more or
less serialized for the core, and that accounts for the performance loss
we are seeing. We also saw that newidle_balance() spends a considerable
amount of time in load_balance() due to rq spinlock contention. Do you think
it would help if the core-wide locking were only performed when absolutely
needed?
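
(For readers not following the whole series: the "core-wide lock" refers to
the rq::lock wrapper this patchset introduces. A minimal sketch of the idea,
with simplified and assumed helper/field names rather than the literal patch,
looks roughly like this:

	/*
	 * Sketch only: rq_lockp() normally returns the per-cpu rq lock,
	 * but once core scheduling is enabled it redirects every sibling
	 * CPU to the core leader's lock.  All rq lock/unlock helpers go
	 * through this wrapper, which is why __schedule() ends up
	 * serialized across the whole core.
	 */
	static inline raw_spinlock_t *rq_lockp(struct rq *rq)
	{
		if (sched_core_enabled(rq))		/* assumed helper name */
			return &rq->core->__lock;	/* shared by all siblings */
		return &rq->__lock;			/* regular per-cpu lock */
	}

Taking the shared lock only when a tagged task can actually run on the core,
as suggested above, would reduce how often __schedule() serializes.)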

Is the core-wide lock primarily responsible for the regression? I ran up to patch
12, which also has the core-wide lock for tagged cgroups and also calls
newidle_balance() from pick_next_task(), and I don't see any regression. Of course,
the core sched version of pick_next_task() may be doing more, but compared with
__pick_next_task() it doesn't look too horrible.
I gathered some data with only 1 DB instance running (which also shows the 52%
slowdown). Following are the numbers of pick_next_task() calls and their avg cost
for patch 12 and patch 15. The total number of calls seems to be similar, but the
avg cost (in us) has more than doubled. For both patches I had put the DB
instance into a cpu-tagged cgroup.

                             patch12        patch15
count pick_next_task         62317898       58925395
avg cost pick_next_task      0.6566323209   1.4223810108
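
(For what it's worth, a minimal sketch of one way such per-call counts and
average cost can be collected, not necessarily how the numbers above were
gathered, is to wrap pick_next_task() and accumulate sched_clock() deltas in
per-CPU counters; the counter names below are made up for illustration:

	DEFINE_PER_CPU(u64, pnt_calls);	/* number of invocations */
	DEFINE_PER_CPU(u64, pnt_ns);	/* total time spent, in ns */

	static struct task_struct *pick_next_task_timed(struct rq *rq,
							struct task_struct *prev,
							struct rq_flags *rf)
	{
		u64 t0 = sched_clock();
		struct task_struct *p = pick_next_task(rq, prev, rf);

		this_cpu_inc(pnt_calls);
		this_cpu_add(pnt_ns, sched_clock() - t0);
		return p;
	}

The avg cost in us is then the summed pnt_ns across CPUs divided by the summed
pnt_calls, divided by 1000.)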