Re: [PATCH 4/4] sched/fair: Proportional newidle balance
From: Mohamed Abuelfotoh, Hazem
Date: Fri Jan 30 2026 - 08:17:38 EST
On 29/01/2026 09:19, Peter Zijlstra wrote:
CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe.
On Wed, Jan 28, 2026 at 03:48:13PM +0000, Mohamed Abuelfotoh, Hazem wrote:
Below are the hardware specs for both c7i.4xlarge & c7a.4xlarge.
c7i.4xlarge
CPU Model: Intel(R) Xeon(R) Platinum 8488C
Number of CPUs: 16
Memory: 32 GB
Number of sockets: 1
But the 8488C is a 56 core part, with 112 threads. So you're handing out
8 core partitions of that thing, for 7 such instances on one machine?
c7i.4xlarge is an EC2 instance which is basically a virtual machine running on Nitro KVM based hypervisor. The VM is sharing the host with other VMs which explain why Amazon doesn't allocate all the host CPU resources to a single VM. There are larger EC2 instance sizes where a single VM would occupy the whole host for example c7i.48xlarge which has 192 vCPUs. Your conclusion is right c7i.4xlarge has 8 Physical cores with HT enabled which adds up to 16 vCPU.
(Also, calling anything 16 core with 32GB 'large' is laughable, that's
laptop territory.)
-------------------------------------------------------------------------
c7a.4xlarge
CPU Model: AMD EPYC 9R14
Number of CPUs: 16
Memory: 32 GB
Number of sockets: 1
And that 9r14 is a 96 core part, 12 CCDs, 8 cores each. So you're again
handing out partitions of that.
For both cases, are these partitions fixed? Specifically in the AMD case,
are you handing out exactly 1 CCDs per partition?
Because if so, I'm mighty confused by the results. 8 cores, 16 threads
is exactly one CCD worth of Zen4 and should therefore be a single L3 and
behave exactly like the Intel thing.
Something is missing here.
The main difference between Intel based c7i.4xlarge vs AMD based c7a.4xlarge is that on Intel we have HT enabled so the instance has 16 vCPU which are really 8 Physical cores with HT enabled. On AMD the VM comes with 16 physical cores with no HT so it has 2 CCDs while on Intel we have a single L3 cache. I am also adding the output of lscpu on both instances to clarify architectural discrepancies between both.
**c7i.4xlarge**
# lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 46 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 16
On-line CPU(s) list: 0-15
Vendor ID: GenuineIntel
BIOS Vendor ID: Intel(R) Corporation
Model name: Intel(R) Xeon(R) Platinum 8488C
BIOS Model name: Intel(R) Xeon(R) Platinum 8488C
CPU family: 6
Model: 143
Thread(s) per core: 2
Core(s) per socket: 8
Socket(s): 1
Stepping: 8
BogoMIPS: 4800.00
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq monitor ss
se3 fma cx16 pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpc
id avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves avx_vnni avx512_bf16 wbnoinvd ida arat avx512vbmi umip pku ospke waitpkg avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx5
12_bitalg tme avx512_vpopcntdq rdpid cldemote movdiri movdir64b md_clear serialize amx_bf16 avx512_fp16 amx_tile amx_int8 flush_l1d arch_capabilities
Virtualization features:
Hypervisor vendor: KVM
Virtualization type: full
Caches (sum of all):
L1d: 384 KiB (8 instances)
L1i: 256 KiB (8 instances)
L2: 16 MiB (8 instances)
L3: 105 MiB (1 instance)
NUMA:
NUMA node(s): 1
NUMA node0 CPU(s): 0-15
-------------------------------------------------------------------------
**c7a.4xlarge**
# lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 48 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 16
On-line CPU(s) list: 0-15
Vendor ID: AuthenticAMD
BIOS Vendor ID: Advanced Micro Devices, Inc.
Model name: AMD EPYC 9R14
BIOS Model name: AMD EPYC 9R14
CPU family: 25
Model: 17
Thread(s) per core: 1
Core(s) per socket: 16
Socket(s): 1
Stepping: 1
BogoMIPS: 5199.99
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf tsc_known_freq pni pclmulqdq monitor
ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy cr8_legacy abm sse4a misalignsse 3dnowprefetch topoext perfctr_core invpcid_single ssbd perfmon_v2 ibrs ibpb stibp ibrs_enhanced vmmcall fsgs
base bmi1 avx2 smep bmi2 invpcid avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves avx512_bf16 clzero xsaveerptr rdpru wbnoinvd arat avx512vbmi pku ospke avx512_vbmi2 gfni vaes
vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq rdpid flush_l1d
Virtualization features:
Hypervisor vendor: KVM
Virtualization type: full
Caches (sum of all):
L1d: 512 KiB (16 instances)
L1i: 512 KiB (16 instances)
L2: 16 MiB (16 instances)
L3: 64 MiB (2 instances)
NUMA:
NUMA node(s): 1
NUMA node0 CPU(s): 0-15