RE: [PATCH 1/4] softirq: implement IRQ flood detection mechanism

From: Long Li
Date: Fri Sep 20 2019 - 15:12:14 EST


> >> Long, does this patch make any difference?
> >
> > Sagi,
> >
> > Sorry it took a while to bring my system back online.
> >
> > With the patch, the IOPS is about the same drop with the 1st patch. I think
> the excessive context switches are causing the drop in IOPS.
> >
> > The following are captured by "perf sched record" for 30 seconds during
> tests.
> >
> > "perf sched latency"
> > With patch:
> > fio:(82) | 937632.706 ms | 1782255 | avg: 0.209 ms | max: 63.123
> ms | max at: 768.274023 s
> >
> > without patch:
> > fio:(82) |2348323.432 ms | 18848 | avg: 0.295 ms | max: 28.446
> ms | max at: 6447.310255 s
>
> Without patch means the proposed hard-irq patch?

It means the current upstream code without any patch. But It's prone to soft lockup.

Ming's proposed hard-irq patch gets similar results to "without patch", however it fixes the soft lockup.

>
> If we are context switching too much, it means the soft-irq operation is not
> efficient, not necessarily the fact that the completion path is running in soft-
> irq..
>
> Is your kernel compiled with full preemption or voluntary preemption?

The tests are based on Ubuntu 18.04 kernel configuration. Here are the parameters:

# CONFIG_PREEMPT_NONE is not set
CONFIG_PREEMPT_VOLUNTARY=y
# CONFIG_PREEMPT is not set

>
> > Look closer at each CPU, we can see ksoftirqd is competing CPU with
> > fio (and effectively throttle other fio processes) (captured in
> > /sys/kernel/debug/tracing, echo sched:* >set_event)
> >
> > On CPU1 with patch: (note that the prev_state for fio is "R", it's
> preemptively scheduled)
> > <...>-4077 [001] d... 66456.805062: sched_switch: prev_comm=fio
> prev_pid=4077 prev_prio=120 prev_state=R ==> next_comm=ksoftirqd/1
> next_pid=17 next_prio=120
> > <...>-17 [001] d... 66456.805859: sched_switch:
> prev_comm=ksoftirqd/1 prev_pid=17 prev_prio=120 prev_state=S ==>
> next_comm=fio next_pid=4077 next_prio=120
> > <...>-4077 [001] d... 66456.844049: sched_switch: prev_comm=fio
> prev_pid=4077 prev_prio=120 prev_state=R ==> next_comm=ksoftirqd/1
> next_pid=17 next_prio=120
> > <...>-17 [001] d... 66456.844607: sched_switch:
> prev_comm=ksoftirqd/1 prev_pid=17 prev_prio=120 prev_state=S ==>
> next_comm=fio next_pid=4077 next_prio=120
> >
> > On CPU1 without patch: (the prev_state for fio is "S", it's voluntarily
> scheduled)
> > <idle>-0 [001] d... 6725.392308: sched_switch:
> prev_comm=swapper/1 prev_pid=0 prev_prio=120 prev_state=R ==>
> next_comm=fio next_pid=14342 next_prio=120
> > fio-14342 [001] d... 6725.392332: sched_switch: prev_comm=fio
> prev_pid=14342 prev_prio=120 prev_state=S ==> next_comm=swapper/1
> next_pid=0 next_prio=120
> > <idle>-0 [001] d... 6725.392356: sched_switch:
> prev_comm=swapper/1 prev_pid=0 prev_prio=120 prev_state=R ==>
> next_comm=fio next_pid=14342 next_prio=120
> > fio-14342 [001] d... 6725.392425: sched_switch:
> > prev_comm=fio prev_pid=14342 prev_prio=120 prev_state=S ==>
> > next_comm=swapper/1 next_pid=0 next_prio=12