Re: [RFC PATCH v3 00/16] Core scheduling v3

From: Aubrey Li
Date: Tue Aug 27 2019 - 19:24:21 EST


On Wed, Aug 28, 2019 at 5:14 AM Matthew Garrett <mjg59@xxxxxxxxxxxxx> wrote:
>
> Apple have provided a sysctl that allows applications to indicate that
> specific threads should make use of core isolation while allowing
> the rest of the system to make use of SMT, and browsers (Safari, Firefox
> and Chrome, at least) are now making use of this. Trying to do something
> similar using cgroups seems a bit awkward. Would something like this be
> reasonable? Having spoken to the Chrome team, I believe that the
> semantics we want are:
>
> 1) A thread to be able to indicate that it should not run on the same
> core as anything not in posession of the same cookie
> 2) Descendents of that thread to (by default) have the same cookie
> 3) No other thread be able to obtain the same cookie
> 4) Threads not be able to rejoin the global group (ie, threads can
> segregate themselves from their parent and peers, but can never rejoin
> that group once segregated)
>
> but don't know if that's what everyone else would want.
>
> diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
> index 094bb03b9cc2..5d411246d4d5 100644
> --- a/include/uapi/linux/prctl.h
> +++ b/include/uapi/linux/prctl.h
> @@ -229,4 +229,5 @@ struct prctl_mm_map {
> # define PR_PAC_APDBKEY (1UL << 3)
> # define PR_PAC_APGAKEY (1UL << 4)
>
> +#define PR_CORE_ISOLATE 55
> #endif /* _LINUX_PRCTL_H */
> diff --git a/kernel/sys.c b/kernel/sys.c
> index 12df0e5434b8..a054cfcca511 100644
> --- a/kernel/sys.c
> +++ b/kernel/sys.c
> @@ -2486,6 +2486,13 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
> return -EINVAL;
> error = PAC_RESET_KEYS(me, arg2);
> break;
> + case PR_CORE_ISOLATE:
> +#ifdef CONFIG_SCHED_CORE
> + current->core_cookie = (unsigned long)current;

Because AVX512 instructions could pull down the core frequency,
we also want to give a magic cookie number to all AVX512-using
tasks on the system, so they won't affect the performance/latency
of any other tasks.

This could be done by putting all AVX512 tasks into a cgroup, or
by AVX512 detection the following patch introduced.

https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?id=2f7726f955572e587d5f50fbe9b2deed5334bd90

Thanks,
-Aubrey