Re: [PATCH v2] sched/fair: sanitize vruntime of entity being migrated

From: Zhang Qiao
Date: Thu Mar 09 2023 - 04:30:20 EST

On 2023/3/9 17:09, Dietmar Eggemann wrote:
> On 09/03/2023 09:37, Zhang Qiao wrote:
>>
>> On 2023/3/8 20:55, Vincent Guittot wrote:
>>> On Wednesday 08 March 2023 at 09:01:05 (+0100), Vincent Guittot wrote:
>>>> On Tue, 7 Mar 2023 at 14:41, Zhang Qiao <zhangqiao22@xxxxxxxxxx> wrote:
>
> [...]
>
>>>>> On 2023/3/7 18:26, Vincent Guittot wrote:
>>>>>> On Mon, 6 Mar 2023 at 14:53, Vincent Guittot <vincent.guittot@xxxxxxxxxx> wrote:
>>>>>>>
>>>>>>> On Mon, 6 Mar 2023 at 13:57, Zhang Qiao <zhangqiao22@xxxxxxxxxx> wrote:
>
> [...]
>
>>> +static inline bool migrate_long_sleeper(struct sched_entity *se)
>>> +{
>>> + struct cfs_rq *cfs_rq;
>>> + u64 sleep_time;
>>> +
>>> + if (se->exec_start == 0)
>>
>> How about using `se->avg.last_update_time == 0` here?
>
> IMHO, both checks are not needed here since we're still dealing with the
> originating CPU of the migration. Both of them are set to 0 only at the
> end of migrate_task_rq_fair().

Yes, if place_entity() doesn't call migrate_long_sleeper(), the check can be removed.

>
>
>>> + return false;
>>> +
>>> + cfs_rq = cfs_rq_of(se);
>>> + /*
>>> + * If the entity slept for a long time, don't even try to normalize its
>>> + * vruntime with the base as it may be too far off and might generate
>>> + * wrong decision because of s64 overflow.
>>> + * We estimate its sleep duration with the last update of se's pelt.
>>> + * The last update happened before sleeping. The cfs' pelt is not
>>> + * always updated when cfs is idle but this is not a problem because
>>> + * its min_vruntime is not updated too, so the situation can't get
>>> + * worse.
>>> + */
>>> + sleep_time = cfs_rq_last_update_time(cfs_rq) - se->avg.last_update_time;
>
> Looks like this doesn't work for asymmetric CPU capacity systems since
> we specifically do a sync_entity_load_avg() in select_task_rq_fair()
> (find_energy_efficient_cpu() for EAS and select_idle_sibling() for CAS)
> to sync cfs_rq and se (including their last_update_time).
>
> [...]
>
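
For reference, the s64 overflow that the comment in the quoted patch is
worried about can be reproduced in userspace. This is only a minimal sketch,
not kernel code: vruntime_before() is a hypothetical helper that compares two
u64 values through a signed delta, which is the usual wraparound-safe idiom,
and the values are made up. It also assumes the usual two's-complement
conversion that gcc and clang perform for the cast.

```c
#include <stdint.h>

/*
 * Hypothetical helper for illustration only: returns 1 if vruntime a is
 * "before" b. The comparison is done on the signed difference so that it
 * survives u64 wraparound, as long as |a - b| stays below 2^63.
 */
static int vruntime_before(uint64_t a, uint64_t b)
{
	return (int64_t)(a - b) < 0;
}

/*
 * Returns 1 if an entity whose vruntime is 'delta' ahead of min_vruntime
 * is still seen as ahead by the signed comparison.
 */
static int seen_as_ahead(uint64_t min_vruntime, uint64_t delta)
{
	return vruntime_before(min_vruntime, min_vruntime + delta);
}
```

With min_vruntime = 1000, seen_as_ahead(1000, 500) returns 1, as expected.
Once the gap exceeds 2^63, for example seen_as_ahead(1000, (1ULL << 63) + 1),
the signed delta flips sign and the call returns 0, i.e. the entity looks
like it is behind min_vruntime. That is the wrong decision a long sleeper
could trigger if its vruntime were normalized against a base that has moved
too far.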