Re: [PATCHv5 2/3] watchdog/softlockup: report the most frequent interrupts

From: Bitao Hu
Date: Wed Feb 07 2024 - 01:19:11 EST


Hi,

On 2024/2/7 05:42, Doug Anderson wrote:
Hi,

On Tue, Feb 6, 2024 at 1:59 AM Bitao Hu <yaoma@xxxxxxxxxxxxxxxxx> wrote:

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 71d5b6dfa358..26dc1ad86276 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -18,6 +18,9 @@
#include <linux/init.h>
#include <linux/kernel_stat.h>
#include <linux/math64.h>
+#include <linux/irq.h>
+#include <linux/irqdesc.h>
+#include <linux/bitops.h>

These are still not sorted alphabetically. "irq.h" and "irqdesc.h"
should go between "init.h" and "kernel_stat.h". "bitops.h" is trickier
because the existing headers are not quite sorted. Probably the best
would be to fully sort them. They should end up like this:

#include <linux/bitops.h>
#include <linux/cpu.h>
#include <linux/init.h>
#include <linux/irq.h>
#include <linux/irqdesc.h>
#include <linux/kernel_stat.h>
#include <linux/kvm_para.h>
#include <linux/math64.h>
#include <linux/mm.h>
#include <linux/module.h>
#include <linux/nmi.h>
#include <linux/stop_machine.h>
#include <linux/sysctl.h>
#include <linux/tick.h>

#include <linux/sched/clock.h>
#include <linux/sched/debug.h>
#include <linux/sched/isolation.h>

#include <asm/irq_regs.h>

Sorry, I misunderstood your point, thinking that they should only be
added between "init.h" and "module.h". I will arrange them in the
alphabetical order as you suggested.


+static void start_counting_irqs(void)
+{
+ int i;
+ struct irq_desc *desc;
+ u32 *counts = __this_cpu_read(hardirq_counts);
+ int cpu = smp_processor_id();
+
+ if (!test_bit(cpu, softlockup_hardirq_cpus)) {

I don't think you need "softlockup_hardirq_cpus", do you? Just read
"actual_nr_irqs" and see if it's non-zero? ...or read "hardirq_counts"
and see if it's non-NULL?
Sure, the existing variables are sufficient for making a determination.
And may be I should swap it to make the decision logic here clearer,
like this (untested)?

bool is_counting_started(void)
{
return !!__this_cpu_read(hardirq_counts);
}

if (!is_counting_started()) {


+ /*
+ * nr_irqs has the potential to grow at runtime. We should read
+ * it and store locally to avoid array out-of-bounds access.
+ */
+ __this_cpu_write(actual_nr_irqs, nr_irqs);

nit: IMO store nr_irqs in a local variable to avoid all of the
"__this_cpu_read" calls everywhere. Then just write it once from your
local variable.
OK.


+ counts = kmalloc_array(__this_cpu_read(actual_nr_irqs),
+ sizeof(u32),
+ GFP_ATOMIC);

should use "kcalloc()" so the array is zeroed. That way if the set of
non-NULL "desc"s changes between calls you don't end up reading
uninitialized memory.
OK, I will use "kcalloc()" here.


+static void stop_counting_irqs(void)
+{
+ u32 *counts = __this_cpu_read(hardirq_counts);
+ int cpu = smp_processor_id();
+
+ if (test_bit(cpu, softlockup_hardirq_cpus)) {
+ kfree(counts);
+ counts = NULL;
+ __this_cpu_write(hardirq_counts, counts);

nit: don't really need to set the local "counts" to NULL. Just:

__this_cpu_write(hardirq_counts, NULL);

...and actually if you take my advice above and get rid of
"softlockup_hardirq_cpus" then this function just becomes:

kfree(__this_cpu_read(hardirq_counts));
__this_cpu_write(hardirq_counts, NULL);

Since kfree() handles when you pass it NULL...
OK.


+static void print_irq_counts(void)
+{
+ int i;
+ struct irq_desc *desc;
+ u32 counts_diff;
+ u32 *counts = __this_cpu_read(hardirq_counts);
+ int cpu = smp_processor_id();
+ struct irq_counts irq_counts_sorted[NUM_HARDIRQ_REPORT] = {
+ {-1, 0}, {-1, 0}, {-1, 0}, {-1, 0},
+ };
+
+ if (test_bit(cpu, softlockup_hardirq_cpus)) {
+ for_each_irq_desc(i, desc) {
+ if (!desc)
+ continue;

The "if" test above isn't needed. The "for_each_irq_desc()" macro
already checks for NULL.
Thanks for your reminder.



+ /*
+ * We need to bounds-check in case someone on a different CPU
+ * expanded nr_irqs.
+ */
+ if (i < __this_cpu_read(actual_nr_irqs))
+ counts_diff = desc->kstat_irqs ?
+ *this_cpu_ptr(desc->kstat_irqs) - counts[i] : 0;
+ else
+ counts_diff = desc->kstat_irqs ?
+ *this_cpu_ptr(desc->kstat_irqs) : 0;

Why do you need to test "kstat_irqs" for 0?
Although "alloc_desc" wil allocate both "desc" and "kstat_irqs" at the
same time, I refer to the usage of "kstat_irqs" in "show_interrupts"
from kernel/irq/proc.c, where it does perform a check.

kernel/irq/proc.c: show_interrupts
for_each_online_cpu(j)
seq_printf(p, "%10u ", desc->kstat_irqs ?
*per_cpu_ptr(desc->kstat_irqs, j) : 0);

I'm not sure why this is necessary, so I copied it as it is. Can we skip
the check in "print_irq_counts"?

duplicate the math. In other words, I'd expect this (untested):

if (i < __this_cpu_read(actual_nr_irqs))
count = counts[i];
else
count = 0;
counts_diff = *this_cpu_ptr(desc->kstat_irqs) - count;
Agree.

I guess I'd also put "__this_cpu_read(actual_nr_irqs)" in a local
variable like you do with counts...