Re: [PATCH 0/8] x86, sched: Dynamic ITMT core ranking support and some yak shaving

From: K Prateek Nayak
Date: Thu Dec 12 2024 - 23:13:19 EST


Hello Tim,

On 12/13/2024 6:03 AM, Tim Chen wrote:
> On Wed, 2024-12-11 at 18:55 +0000, K Prateek Nayak wrote:
>> The ITMT infrastructure currently assumes that ITMT rankings are
>> static and set correctly prior to enabling ITMT support, which allows
>> the CPU with the highest core ranking to be cached as the
>> "asym_prefer_cpu" in the sched_group struct. However, with the
>> introduction of Preferred Core support in amd-pstate, these rankings
>> can change at runtime.
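
To spell out the problem for folks following along: the group-wide
preferred CPU is computed once from the per-CPU rankings and then reused,
which is only safe as long as the rankings never change. Below is a tiny
userspace model of that assumption (made-up CPU numbers and priorities,
not the actual kernel data structures):

#include <stdio.h>

/* Made-up per-CPU priorities standing in for the ITMT core rankings. */
static int cpu_priority[] = { 10, 30, 20, 40 };

/* Scan a "group" of CPUs and return the one with the highest ranking. */
static int pick_preferred_cpu(const int *cpus, int nr)
{
	int i, best = cpus[0];

	for (i = 1; i < nr; i++)
		if (cpu_priority[cpus[i]] > cpu_priority[best])
			best = cpus[i];

	return best;
}

int main(void)
{
	int group[] = { 0, 1, 2, 3 };
	int cached = pick_preferred_cpu(group, 4);	/* cached once */

	printf("cached preferred CPU: %d\n", cached);	/* prints 3 */

	/* A runtime re-rank (e.g. a Preferred Core update) ... */
	cpu_priority[2] = 50;

	/* ... leaves the cached value stale. */
	printf("cached: %d, current highest ranked: %d\n",
	       cached, pick_preferred_cpu(group, 4));	/* 3 vs 2 */
	return 0;
}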

>> This series adds support for dynamic rankings in the generic scheduler
>> layer without the need to rebuild the sched domain hierarchy, and
>> fixes an issue with x86_die_flags() on AMD systems that support
>> Preferred Core ranking, with some yak shaving done along the way.

>> Patches 1 to 4 are independent cleanups around the ITMT
>> infrastructure, removal of the x86_smt_flags wrapper, and moving the
>> "sched_itmt_enabled" sysctl to debugfs.

>> Patch 5 adds the SD_ASYM_PACKING flag to the PKG domain on all ITMT
>> enabled systems. The rationale behind the addition is elaborated in
>> that patch. One open question remains for Intel processors with
>> multiple Tiles in a PKG which advertise multiple LLCs within the PKG
>> and support ITMT: is it okay to set SD_ASYM_PACKING for the PKG domain
>> on these processors?

> After talking to my colleagues Ricardo and Srinivas, we think that this
> should be fine for Intel CPUs.

Thank you for confirming that. Could you also confirm whether my
observations on Patch 5 for Intel systems cover all possible scenarios
for the ones that feature multiple MC groups within a PKG and enable
ITMT support? If I'm missing something, please do let me know and we can
hash out the implementation details.

Thanks a ton for reviewing the series!

--
Thanks and Regards,
Prateek


> Tim


>> Patches 6 and 7 are independent possible micro-optimizations
>> discovered when auditing update_sg_lb_stats().

>> Patch 8 uncaches the asym_prefer_cpu from the sched_group struct and
>> finds it during load balancing in update_sg_lb_stats(), before it is
>> used to make any scheduling decisions. This is the simplest approach;
>> an alternate approach would be to move the asym_prefer_cpu to
>> sched_domain_shared and allow the first load balancing instance after
>> a priority change to update the cached asym_prefer_cpu. On systems
>> with static priorities, this would retain the benefits of caching,
>> while on systems with dynamic priorities it would reduce the overhead
>> of finding the "asym_prefer_cpu" each time update_sg_lb_stats() is
>> called. However, those benefits come with added code complexity, which
>> is why Patch 8 is marked as an RFC.
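
To make the tradeoff a bit more concrete, below is a rough userspace
model of the Patch 8 approach (made-up names and data; the real change
works on the sched_group's cpumask inside update_sg_lb_stats()). The
search for the highest ranked CPU is folded into the per-CPU walk that
the stats pass already does, so the result always reflects the current
rankings:

#include <stdio.h>

/* Made-up stand-ins for the per-CPU rankings and load. */
static int cpu_priority[] = { 10, 30, 20, 40 };
static int cpu_load[]     = { 55, 25, 70, 10 };

struct group_stats {
	int total_load;
	int preferred_cpu;	/* found on the fly, nothing cached */
};

/*
 * Model of the Patch 8 idea: the stats pass already walks every CPU in
 * the group, so the highest ranked CPU can be picked up in the same
 * loop and stays in sync with the current rankings.
 */
static void update_group_stats(struct group_stats *gs, const int *cpus,
			       int nr)
{
	int i;

	gs->total_load = 0;
	gs->preferred_cpu = cpus[0];

	for (i = 0; i < nr; i++) {
		gs->total_load += cpu_load[cpus[i]];
		if (cpu_priority[cpus[i]] > cpu_priority[gs->preferred_cpu])
			gs->preferred_cpu = cpus[i];
	}
}

int main(void)
{
	int group[] = { 0, 1, 2, 3 };
	struct group_stats gs;

	update_group_stats(&gs, group, 4);
	printf("load: %d, preferred CPU: %d\n", gs.total_load, gs.preferred_cpu);

	/* A runtime re-rank is picked up by the very next stats pass. */
	cpu_priority[2] = 50;
	update_group_stats(&gs, group, 4);
	printf("load: %d, preferred CPU: %d\n", gs.total_load, gs.preferred_cpu);
	return 0;
}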

[..snip..]