Re: [PATCH v8 -tip 00/26] Core scheduling

From: Li, Aubrey
Date: Mon Nov 09 2020 - 01:04:51 EST


On 2020/11/7 1:54, Joel Fernandes wrote:
> On Fri, Nov 06, 2020 at 10:58:58AM +0800, Li, Aubrey wrote:
>
>>>
>>> -- workload D, new added syscall workload, performance drop in cs_on:
>>> +----------------------+------+-------------------------------+
>>> |                      | **   | will-it-scale * 192           |
>>> |                      |      | (pipe based context_switch)   |
>>> +======================+======+===============================+
>>> | cgroup               | **   | cg_will-it-scale              |
>>> +----------------------+------+-------------------------------+
>>> | record_item          | **   | threads_avg                   |
>>> +----------------------+------+-------------------------------+
>>> | coresched_normalized | **   | 0.2                           |
>>> +----------------------+------+-------------------------------+
>>> | default_normalized   | **   | 1                             |
>>> +----------------------+------+-------------------------------+
>>> | smtoff_normalized    | **   | 0.89                          |
>>> +----------------------+------+-------------------------------+
>>
>> will-it-scale may be a very extreme case. The story here is,
>> - On one sibling reader/writer gets blocked and tries to schedule another reader/writer in.
>> - The other sibling tries to wake up reader/writer.
>>
>> Both CPUs are acquiring rq->__lock,
>>
>> So when coresched off, they are two different locks, lock stat(1 second delta) below:
>>
>> class name    con-bounces  contentions  waittime-min  waittime-max  waittime-total  waittime-avg  acq-bounces  acquisitions  holdtime-min  holdtime-max  holdtime-total  holdtime-avg
>> &rq->__lock:          210          210          0.10          3.04          180.87          0.86          797      79165021          0.03         20.69     60650198.34          0.77
>>
>> But when coresched on, they are actually one same lock, lock stat(1 second delta) below:
>>
>> class name    con-bounces  contentions  waittime-min  waittime-max  waittime-total  waittime-avg  acq-bounces  acquisitions  holdtime-min  holdtime-max  holdtime-total  holdtime-avg
>> &rq->__lock:      6479459      6484857          0.05        216.46     60829776.85          9.38      8346319      15399739          0.03         95.56     81119515.38          5.27
>>
>> This nature of core scheduling may degrade the performance of similar workloads with frequent context switching.
>
> When core sched is off, is SMT off as well? From the above table, it seems to
> be. So even for core sched off, there will be a single lock per physical CPU
> core (assuming SMT is also off) right? Or did I miss something?
>

The table includes 3 cases:
- default: SMT on, coresched off
- coresched: SMT on, coresched on
- smtoff: SMT off, coresched off

I was comparing the default case (SMT on, coresched off) with the
coresched case (SMT on, coresched on).
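
To make the difference between the three cases concrete, here is a toy
model (not kernel code) of which lock each CPU's runqueue ends up taking.
It assumes that with coresched on, all SMT siblings of a core share one
core-wide lock, and that sibling CPUs are numbered adjacently; the names
rq_lock_id() and shares_lock() are made up for this sketch.

```python
def rq_lock_id(cpu, siblings_per_core, coresched_on):
    """Return an identifier for the lock this CPU's runqueue takes (toy model)."""
    if coresched_on:
        # Assumption: every SMT sibling of a core maps to one core-wide lock.
        return ("core", cpu // siblings_per_core)
    # Otherwise each CPU keeps its own per-runqueue lock.
    return ("cpu", cpu)


def shares_lock(cpu_a, cpu_b, siblings_per_core, coresched_on):
    """True if the two CPUs contend on the same lock in this model."""
    return (rq_lock_id(cpu_a, siblings_per_core, coresched_on)
            == rq_lock_id(cpu_b, siblings_per_core, coresched_on))


if __name__ == "__main__":
    # CPUs 0 and 1 are siblings of core 0 when siblings_per_core == 2.
    cases = {
        "default (SMT on, coresched off)": (2, False),
        "coresched (SMT on, coresched on)": (2, True),
        "smtoff (SMT off, coresched off)": (1, False),
    }
    for name, (siblings, cs_on) in cases.items():
        print(name, "-> same lock:", shares_lock(0, 1, siblings, cs_on))
```

Only the coresched case puts the blocked reader/writer and the waker on the
same lock, which is where the contention comes from.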

With SMT off, the reader and writer run on different cores, each with its
own rq->__lock, so the lock contention is not that serious, as the lock stat
below shows:

class name    con-bounces  contentions  waittime-min  waittime-max  waittime-total  waittime-avg  acq-bounces  acquisitions  holdtime-min  holdtime-max  holdtime-total  holdtime-avg
&rq->__lock:           60           60          0.11          1.92           41.33          0.69          127      67184172          0.03         22.95     33160428.37          0.49
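
For a quick side-by-side view, the three lock stat samples in this thread
can be reduced to a contention ratio (contentions / acquisitions) and an
average wait per contention (waittime-total / contentions). This is only a
sketch over the numbers already posted; the dictionary layout and helper
names are my own.

```python
# contentions, acquisitions and waittime-total copied from the lock stat
# output for &rq->__lock in this thread.
LOCK_STAT = {
    "default":   {"contentions": 210,     "acquisitions": 79165021, "waittime_total": 180.87},
    "coresched": {"contentions": 6484857, "acquisitions": 15399739, "waittime_total": 60829776.85},
    "smtoff":    {"contentions": 60,      "acquisitions": 67184172, "waittime_total": 41.33},
}


def contention_ratio(sample):
    """Fraction of lock acquisitions that had to wait."""
    return sample["contentions"] / sample["acquisitions"]


def avg_wait(sample):
    """Average wait per contended acquisition, in lock stat's own time unit."""
    return sample["waittime_total"] / sample["contentions"]


if __name__ == "__main__":
    for case, sample in LOCK_STAT.items():
        print(f"{case:10s} contention ratio {contention_ratio(sample):.6%}  "
              f"avg wait {avg_wait(sample):.2f}")
```

With these numbers, roughly 42% of acquisitions contend in the coresched
case, against well under 0.001% in the other two.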

Does this address your concern?

Thanks,
-Aubrey
