Re: [RFC PATCH v4 3/3] softirq: defer softirq processing to ksoftirqd if CPU is busy with RT

From: Qais Yousef
Date: Mon Oct 10 2022 - 12:09:27 EST


Hi John

On 10/03/22 23:20, John Stultz wrote:
> From: Pavankumar Kondeti <pkondeti@xxxxxxxxxxxxxx>
>
> Defer the softirq processing to ksoftirqd if a RT task is
> running or queued on the current CPU. This complements the RT
> task placement algorithm which tries to find a CPU that is not
> currently busy with softirqs.
>
> Currently NET_TX, NET_RX, BLOCK and IRQ_POLL softirqs are only
> deferred as they can potentially run for long time.
>
> Additionally, this patch stubs out ksoftirqd_running() logic,
> in the CONFIG_RT_SOFTIRQ_OPTIMIZATION case, as deferring
> potentially long-running softirqs will cause the logic to not
> process shorter-running softirqs immediately. By stubbing it out
> the potentially long running softirqs are deferred, but the
> shorter running ones can still run immediately.

The cover letter didn't make it to my inbox (nor to others AFAICS from lore),
so replying here.

The series looks good to me. It offers a compromise to avoid an existing
conflict between RT and softirqs without disrupting much how both inherently
work. I guess it's up to the maintainers to decide if this direction is
acceptable or not.

I've seen Thomas had a comment on another softirq series (which attempts to
throttle them ;-) by the way that is worth looking it:

https://lore.kernel.org/lkml/877d81jc13.ffs@tglx/


Meanwhile, I did run a few tests on a test laptop that has 2 core SMT2 i7
laptop (since I assume you tested on Android)

I set priority to 1 for all of these cyclic tests.

First I ran without applying your patch to fix the affinity problem in
cyclictest:

I had a 1 hour run of 4 threads - 4 iperf threads and 4 dd threads are
doing work in the background:

| vanilla | with softirq patches v4 |
-------------------|-----------------|-------------------------|
t0 max delay (us) | 6728 | 2096 |
t1 max delay (us) | 2545 | 1990 |
t2 max delay (us) | 2282 | 2094 |
t3 max delay (us) | 6038 | 2162 |

Which shows max latency is improved a lot. Though because I missed applying
your cyclictest patch, I believe this can be attributed only to patch 3 which
defers the softirq if there's current task is an RT one.


I applied your patch to cyclictest to NOT force affinity when specifying -t
option.



Ran cyclictest for 4 hours, -t 3, 3 iperf threads and 3 dd threads running in
the background:

| vanilla | with softirq patches v4 |
-------------------|-----------------|-------------------------|
t0 max delay (us) | 2656 | 2164 |
t1 max delay (us) | 2773 | 1982 |
t2 max delay (us) | 2272 | 2278 |

I didn't hit a big max delay on this case. It shows things are better still.



Ran another cyclictest for 4 hours, -t 4, 4 iperf threads and 4 dd threads in
the background:

| vanilla | with softirq patches v4 |
-------------------|-----------------|-------------------------|
t0 max delay (us) | 4012 | 7074 |
t1 max delay (us) | 2460 | 9088 |
t2 max delay (us) | 3755 | 2784 |
t3 max delay (us) | 4392 | 2295 |

Here the results worryingly show that applying the patches make things much
worse.

I still have to investigate why. I'll have another run to see if the results
look different, then try to dig more.

All results are from the cyclictest json dump.


Cheers

--
Qais Yousef