Re: [PATCH v2 0/6] arm/arm64: Allow the rescheduling IPI to bypass irq_enter/exit

From: Abhijeet Dharmapurikar
Date: Fri Jun 18 2021 - 15:31:01 EST


Hello All,

We are seeing significant improvements in time it takes for a task to be woken up on an idle cpu with these patches.

A trace output without
<<< 96uS total cost: cpu 1 wakes up rt-app task on cpu 2 >>>
          rt-app-955     [001]    149.387611: sched_wakeup_new: comm=rt-app pid=957 prio=120 target_cpu=002
          rt-app-955     [001]    149.387616: ipi_raise: target_mask=00000000,00000004 (Rescheduling interrupts)
          <idle>-0       [002]    149.387622: cpu_idle: state=4294967295 cpu_id=2
          <idle>-0       [002]    149.387640: irq_handler_entry: irq=1 name=IPI
          <idle>-0       [002]    149.387643: ipi_entry: (Rescheduling interrupts)
          <idle>-0       [002]    149.387646: ipi_exit: (Rescheduling interrupts)
          <idle>-0       [002]    149.387648: irq_handler_exit: irq=1 ret=handled
          <idle>-0       [002]    149.387707: sched_switch: prev_comm=swapper/2 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=rt-app next_pid=957 next_prio=120

With the patches.
<<< 68uS total cost: cpu 1 wakes up T0 on cpu 3 >>>
          rt-app-956     [001]     28.034953: sched_wakeup_new: comm=rt-app pid=958 prio=120 target_cpu=003
          rt-app-956     [001]     28.034958: ipi_raise: target_mask=00000000,00000008 (Rescheduling interrupts)
          <idle>-0       [003]     28.034964: cpu_idle: state=4294967295 cpu_id=3
          <idle>-0       [003]     28.034970: irq_handler_entry: irq=1 name=IPI
          <idle>-0       [003]     28.034974: ipi_entry: (Rescheduling interrupts)
          <idle>-0       [003]     28.034977: ipi_exit: (Rescheduling interrupts)
          <idle>-0       [003]     28.034979: irq_handler_exit: irq=1 ret=handled
          <idle>-0       [003]     28.035021: sched_switch: prev_comm=swapper/3 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=rt-app next_pid=958 next_prio=120

This was taken on a snapdragon device similar to 8350.  This patch series helps in reducing the load time on idle cpus and thereby increase performance KPIs on various benchmarks.

Sent this data in hopes that we resurrect the discussion and get these fixes in.

Thanks,
Abhijeet

On 11/24/2020 6:14 AM, Marc Zyngier wrote:
This is the second version of my earlier series [1], which aims at
fixing (or papering over, depending on how you look at things) a
performance regression seen on arm64 for reched IPI heavy workloads
(such as "perf bench sched pipe").

As eloquently described by Thomas in his earlier replies [2], the
current situation is less than ideal on most architecture except x86,
and my conclusion is that what was broken in 5.9 wouldn't be more
broken in 5.10 with these patches (and addresses the performance
regression).

Needless to say, I intend to try and help fixing the issues Thomas
mentioned, and I believe that Mark (cc'd) already has something that
could be used as a healthy starting point (Mark, do correct me if I
misrepresented your work).

Thanks,

M.

* From v1:
- Added a new __irq_modify_status() helper
- Renamed IRQ_NAKED to IRQ_RAW
- Renamed IRQ_HIDDEN to IRQ_IPI
- Applied the same workaround to 32bit ARM for completeness

[1] https://lore.kernel.org/r/20201101131430.257038-1-maz@xxxxxxxxxx/
[2] https://lore.kernel.org/r/87lfewnmdz.fsf@xxxxxxxxxxxxxxxxxxxxxxx/

Marc Zyngier (6):
genirq: Add __irq_modify_status() helper to clear/set special flags
genirq: Allow an interrupt to be marked as 'raw'
arm64: Mark the recheduling IPI as raw interrupt
arm: Mark the recheduling IPI as raw interrupt
genirq: Drop IRQ_HIDDEN from IRQF_MODIFY_MASK
genirq: Rename IRQ_HIDDEN to IRQ_IPI

arch/arm/Kconfig | 1 +
arch/arm/kernel/smp.c | 6 +++++-
arch/arm64/Kconfig | 1 +
arch/arm64/kernel/smp.c | 6 +++++-
include/linux/irq.h | 11 ++++++++---
kernel/irq/Kconfig | 3 +++
kernel/irq/chip.c | 12 ++++++++++--
kernel/irq/debugfs.c | 3 ++-
kernel/irq/irqdesc.c | 17 ++++++++++++-----
kernel/irq/proc.c | 2 +-
kernel/irq/settings.h | 33 +++++++++++++++++++++++++++------
11 files changed, 75 insertions(+), 20 deletions(-)