Re: [RFC PATCH v2 0/5] Idle Load Balance fixes and softirq enhancements

From: K Prateek Nayak
Date: Wed Sep 04 2024 - 09:52:59 EST


On 9/4/2024 4:42 PM, K Prateek Nayak wrote:
Hello folks,

[..snip..]

Chenyu had reported a regression when running a modified version of
ipistorm that performs a fixed set of IPIs between two CPUs on his
setup with the whole v1 applied. I've benchmarked this series on both an
AMD and an Intel system to catch any significant regression early.
The following numbers are from a dual socket Intel Ice Lake Xeon server
(2 x 32C/64T) and a 3rd Generation AMD EPYC system (2 x 64C/128T) running
ipistorm between CPU8 and CPU16 (unless stated otherwise with *):

base: tip/master at commit 5566819aeba0 ("Merge branch into tip/master:
'x86/timers'") based on v6.11-rc6 + Patch from [1]

Correction: the patch on top of the base should have been the SM_IDLE
fast-path patch from [3], not the one from [1]:
https://lore.kernel.org/lkml/20240809092240.6921-1-kprateek.nayak@xxxxxxx/


==================================================================
Test          : ipistorm (modified)
Units         : % improvement over base kernel
Interpretation: Higher is better
======================= Intel Ice Lake Xeon ======================
kernel:                                               [pct imp]
performance gov, boost on                                -3%
powersave gov, boost on                                  -2%
performance gov, boost off                               -3%
performance gov, boost off, cross node *                 -3%
==================== 3rd Generation AMD EPYC =====================
kernel:                                               [pct imp]
performance gov, boost on, !PREEMPT_RT                   36%
performance gov, boost on, PREEMPT_RT                    54%
==================================================================

PREEMPT_RT kernel is based on:

git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git linux-6.11.y-rt-rebase

at commit 01ab72c93f63 ("Add localversion for -RT release") with the
addition of commit e68ac2b48849 ("softirq: Remove unused 'action'
parameter from action callback") from tip:irq/core and the SM_IDLE
fast-path patch from [3].


* cross node setup used CPU 16 on Node 0 and CPU 17 on Node 1 on the
dual socket Intel Ice Lake Xeon system.

Improvements on PREEMPT_RT can perhaps be attributed to cacheline
aligning the per-cpu softirq_ctrl variable.
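
To illustrate why cacheline alignment of a small per-CPU variable could
matter, here is a minimal userspace sketch. It is not the kernel code from
this series: the 64-byte line size, the struct names and the helper
functions are all assumptions made for the example. It only shows that
packed entries for different CPUs can land on one cache line (so writes
from one CPU invalidate the line for its neighbours), while aligned
entries each get a line of their own.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines, common on current x86 servers. */
#define CACHELINE_SIZE 64
#define NR_DEMO_CPUS   4

/* Hypothetical stand-in for a small per-CPU control block. */
struct demo_ctrl {
	int cnt;
};

/*
 * Packed layout: the array as a whole starts on a cache line boundary,
 * so all four 4-byte entries fall inside the same 64-byte line.
 */
alignas(CACHELINE_SIZE) struct demo_ctrl packed_ctrl[NR_DEMO_CPUS];

/* Aligned layout: every entry is padded out to a full cache line. */
struct demo_ctrl_aligned {
	alignas(CACHELINE_SIZE) int cnt;
};

static_assert(sizeof(struct demo_ctrl_aligned) == CACHELINE_SIZE,
	      "each aligned entry should occupy exactly one cache line");

struct demo_ctrl_aligned aligned_ctrl[NR_DEMO_CPUS];

/* Index of the cache line that an address falls into. */
static uintptr_t cacheline_of(const void *p)
{
	return (uintptr_t)p / CACHELINE_SIZE;
}

/* Returns 1 if the packed entries of CPUs a and b share a cache line. */
int packed_shares_line(int a, int b)
{
	return cacheline_of(&packed_ctrl[a]) == cacheline_of(&packed_ctrl[b]);
}

/* Returns 1 if the aligned entries of CPUs a and b share a cache line. */
int aligned_shares_line(int a, int b)
{
	return cacheline_of(&aligned_ctrl[a]) == cacheline_of(&aligned_ctrl[b]);
}
```

With this layout, packed_shares_line(0, 1) reports sharing and
aligned_shares_line(0, 1) does not. In the kernel, the comparable tools
are the ____cacheline_aligned attribute and DEFINE_PER_CPU_ALIGNED();
whether false sharing is what actually explains the PREEMPT_RT numbers
above remains a guess.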

This series has been marked RFC since this is my first attempt at
dealing with PREEMPT_RT nuances. Any and all feedback is appreciated.

[1] https://lore.kernel.org/lkml/20240710090210.41856-1-kprateek.nayak@xxxxxxx/
[2] https://lore.kernel.org/lkml/fcf823f-195e-6c9a-eac3-25f870cb35ac@xxxxxxxx/
[3] https://lore.kernel.org/lkml/20240809092240.6921-1-kprateek.nayak@xxxxxxx/
[4] https://lore.kernel.org/lkml/225e6d74-ed43-51dd-d1aa-c75c86dd58eb@xxxxxxx/
[5] https://lore.kernel.org/lkml/20240710150557.GB27299@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
---
[..snip..]


--
Thanks and Regards,
Prateek