Re: [RFC PATCH 1/2] sched: Rate limit migrations to 1 per 2ms per task

From: Tim Chen
Date: Wed Sep 06 2023 - 16:53:28 EST


On Wed, 2023-09-06 at 11:47 +0200, Peter Zijlstra wrote:
> On Tue, Sep 05, 2023 at 03:44:57PM -0700, Tim Chen wrote:
>
> > Reading up on sched_clock() documentation and seems like it should 
> > indeed be monotonic.
>
> It tries very hard to be monotonic but cannot guarantee. The moment TSC
> is found unstable it's too late to fix up everything.
>

Yes, if TSC becomes unstable, sched_clock could reset and go way backward.
Perhaps we can add the following check in Mathieu's original
patch to fix things up:

> +static bool should_migrate_task(struct task_struct *p, int prev_cpu)
> +{
	/* sched_clock reset causing next migration time to be too far ahead */
	if (p->se.next_migration_time > sched_clock_cpu(prev_cpu) + SCHED_MIGRATION_RATELIMIT_WINDOW)
		p->se.next_migration_time = sched_clock_cpu(prev_cpu) + SCHED_MIGRATION_RATELIMIT_WINDOW;

> +	/* Rate limit task migration. */
> +	if (sched_clock_cpu(prev_cpu) < p->se.next_migration_time)
> +		return false;
> +	return true;
> +}
> +
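
To illustrate the intent of the clamp, here is a rough userspace model, not
kernel code. It assumes the window is 2 ms in nanoseconds (from the subject
line), and the names task_model, model_should_migrate and demo_clock_reset
are made up for the sketch. It also reads the clock once per call rather
than up to three times as in the snippet above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed value: 2 ms in nanoseconds, taken from the patch subject line. */
#define RATELIMIT_WINDOW_NS 2000000ULL

/* Hypothetical stand-in for the p->se.next_migration_time field. */
struct task_model {
	uint64_t next_migration_time;
};

/*
 * Userspace model of should_migrate_task() with the proposed clamp.
 * 'now' stands in for a single read of sched_clock_cpu(prev_cpu).
 */
static bool model_should_migrate(struct task_model *t, uint64_t now)
{
	/* A deadline more than one window ahead implies the clock went backward. */
	if (t->next_migration_time > now + RATELIMIT_WINDOW_NS)
		t->next_migration_time = now + RATELIMIT_WINDOW_NS;

	/* Rate limit: refuse while the deadline is still in the future. */
	return now >= t->next_migration_time;
}

/* Walk through a clock reset: deadline set near t=10 s, clock restarts near 0. */
static void demo_clock_reset(void)
{
	struct task_model t = {
		.next_migration_time = 10000000000ULL + RATELIMIT_WINDOW_NS,
	};

	/* Clock jumped back to 1 ms: migration refused, deadline clamped to 3 ms. */
	assert(!model_should_migrate(&t, 1000000ULL));
	assert(t.next_migration_time == 3000000ULL);

	/* At 3 ms the task may migrate again, instead of waiting about 10 s. */
	assert(model_should_migrate(&t, 3000000ULL));
}
```

With the clamp, a backward jump costs the task at most one extra window of
rate limiting. Without it, the task would stay pinned until the clock caught
up with the stale deadline.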

Tim