Re: [PATCH v3] sched/fair: sanitize vruntime of entity being placed

From: Zhang Qiao
Date: Thu Mar 02 2023 - 04:36:25 EST




On 2023/2/27 22:37, Vincent Guittot wrote:
> On Mon, 27 Feb 2023 at 09:43, Roman Kagan <rkagan@xxxxxxxxx> wrote:
>>
>> On Tue, Feb 21, 2023 at 06:26:11PM +0100, Vincent Guittot wrote:
>>> On Tue, 21 Feb 2023 at 17:57, Roman Kagan <rkagan@xxxxxxxxx> wrote:
>>>> What scares me, though, is that I've got a message from the test robot
>>>> that this commit dramatically affected hackbench results, see the quote
>>>> below. I expected the commit not to affect any benchmarks.
>>>>
>>>> Any idea what could have caused this change?
>>>
>>> Hmm, it's most probably because se->exec_start is reset after a
>>> migration and the condition becomes true for a newly migrated task,
>>> whereas its vruntime should be after min_vruntime.
>>>
>>> We have missed this condition.
>>
>> Makes sense to me.
>>
>> But what would then be the reliable way to detect a sched_entity which
>> has slept for long and risks overflowing in .vruntime comparison?
>
> For now I don't have a better idea than adding the same check in
> migrate_task_rq_fair()

Hi, Vincent,
I fixed this condition as you said, and the test results are as follows.

testcase: hackbench -g 44 -f 20 --process --pipe -l 60000 -s 100
version1: v6.2
version2: v6.2 + commit 829c1651e9c4
version3: v6.2 + commit 829c1651e9c4 + this patch

-------------------------------------------------
          version1   version2   version3
test1       81.0      118.1       82.1
test2       82.1      116.9       80.3
test3       83.2      103.9       83.3
avg(s)      82.1      113.0       81.9
-------------------------------------------------
After handling the task migration case, the hackbench results are restored.
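
To make the regression mechanism concrete, here is a minimal user-space
sketch (not kernel code) of the long-sleep check. The helper name
looks_like_long_sleep() and its guard_zero parameter are made up for
illustration; only the 60-second comparison and the exec_start != 0 guard
come from the patch. It assumes, as noted above, that exec_start reads as 0
right after a migration.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000LL

/*
 * Hypothetical helper: mirrors the long-sleep test in place_entity().
 * With guard_zero set, an exec_start of 0 (assumed to mean "reset by
 * migration") is never treated as a long sleep.
 */
static int looks_like_long_sleep(uint64_t now, uint64_t exec_start,
				 int guard_zero)
{
	uint64_t sleep_time = now - exec_start;

	if (guard_zero && exec_start == 0)
		return 0;

	return (int64_t)sleep_time > 60LL * NSEC_PER_SEC;
}
```

With the clock one hour after boot, a task that ran 5 ms ago is not flagged,
a task with exec_start == 0 is flagged by the unguarded check, and the
guarded check leaves it alone.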

The patch is as follows. How does this look?

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index ff4dbbae3b10..3a88d20fd29e 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4648,6 +4648,26 @@ static void check_spread(struct cfs_rq *cfs_rq, struct sched_entity *se)
 #endif
 }

+static inline u64 sched_sleeper_credit(struct sched_entity *se)
+{
+
+	unsigned long thresh;
+
+	if (se_is_idle(se))
+		thresh = sysctl_sched_min_granularity;
+	else
+		thresh = sysctl_sched_latency;
+
+	/*
+	 * Halve their sleep time's effect, to allow
+	 * for a gentler effect of sleepers:
+	 */
+	if (sched_feat(GENTLE_FAIR_SLEEPERS))
+		thresh >>= 1;
+
+	return thresh;
+}
+
 static void
 place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
 {
@@ -4664,23 +4684,8 @@ place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
 		vruntime += sched_vslice(cfs_rq, se);

 	/* sleeps up to a single latency don't count. */
-	if (!initial) {
-		unsigned long thresh;
-
-		if (se_is_idle(se))
-			thresh = sysctl_sched_min_granularity;
-		else
-			thresh = sysctl_sched_latency;
-
-		/*
-		 * Halve their sleep time's effect, to allow
-		 * for a gentler effect of sleepers:
-		 */
-		if (sched_feat(GENTLE_FAIR_SLEEPERS))
-			thresh >>= 1;
-
-		vruntime -= thresh;
-	}
+	if (!initial)
+		vruntime -= sched_sleeper_credit(se);

 	/*
 	 * Pull vruntime of the entity being placed to the base level of
@@ -4690,7 +4695,7 @@ place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
 	 * inversed due to s64 overflow.
 	 */
 	sleep_time = rq_clock_task(rq_of(cfs_rq)) - se->exec_start;
-	if ((s64)sleep_time > 60LL * NSEC_PER_SEC)
+	if (se->exec_start != 0 && (s64)sleep_time > 60LL * NSEC_PER_SEC)
 		se->vruntime = vruntime;
 	else
 		se->vruntime = max_vruntime(se->vruntime, vruntime);
@@ -7634,8 +7639,12 @@ static void migrate_task_rq_fair(struct task_struct *p, int new_cpu)
 	 */
 	if (READ_ONCE(p->__state) == TASK_WAKING) {
 		struct cfs_rq *cfs_rq = cfs_rq_of(se);
+		u64 sleep_time = rq_clock_task(rq_of(cfs_rq)) - se->exec_start;

-		se->vruntime -= u64_u32_load(cfs_rq->min_vruntime);
+		if ((s64)sleep_time > 60LL * NSEC_PER_SEC)
+			se->vruntime = -sched_sleeper_credit(se);
+		else
+			se->vruntime -= u64_u32_load(cfs_rq->min_vruntime);
 	}

 	if (!task_on_rq_migrating(p)) {
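
For reference, the inversion that the 60-second check protects against can
be reproduced in user space. The sketch below assumes that max_vruntime()
compares through a signed 64-bit delta, as it does in kernel/sched/fair.c;
the _sketch suffix marks it as an illustration on stdint types, not the
kernel function itself.

```c
#include <stdint.h>

/*
 * User-space copy of the comparison used by max_vruntime(): the
 * difference is taken in u64 and then interpreted as s64, so it is
 * only meaningful while the two values are less than 2^63 apart.
 * (The conversion to int64_t wraps on the usual two's-complement
 * targets.)
 */
static uint64_t max_vruntime_sketch(uint64_t max_vruntime, uint64_t vruntime)
{
	int64_t delta = (int64_t)(vruntime - max_vruntime);

	if (delta > 0)
		max_vruntime = vruntime;

	return max_vruntime;
}
```

With a stale entity vruntime of 100 and a placement value more than 2^63
ahead of it, the helper keeps 100, which is the case where the patch assigns
the vruntime directly instead of going through max_vruntime().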



Thanks.
Zhang Qiao.

>
>>
>> Thanks,
>> Roman.