Re: d57108d4f6 ("watchdog/core: Get rid of the thread .."): BUG: unable to handle kernel NULL pointer dereference at 0000000000000208

From: Thomas Gleixner
Date: Sat Sep 16 2017 - 13:40:21 EST


On Sat, 16 Sep 2017, Fengguang Wu wrote:
> > > [ 0.038086] Performance Events: unsupported p6 CPU model 61 no PMU
> > > driver, software events only.
>
> What's your host CPU? I can reproduce it in Nehalem, Haswell and Sandy
> Bridge machines with the attached script.

My bad. I booted the wrong config ....

> > > [ 0.041031] Hierarchical SRCU implementation.
> > > [ 0.046210] NMI watchdog: Perf event create on CPU 0 failed with -2
> > > [ 0.046980] NMI watchdog: Perf NMI watchdog permanetely disabled
> > >
> > > Confused
> >
> > I still can't reproduce. Can you please apply the debug patch below and
> > provide the output?
>
> OK. I'll try and report back tomorrow.

Don't bother. I found it already. On UP we have:

#define for_each_cpu(cpu, mask)			\
	for ((cpu) = 0; (cpu) < 1; (cpu)++, (void)mask)

which is a total fail: it ignores the mask and assumes that every cpumask
has bit 0 set. So on UP the loop body of for_each_cpu() and of the other
variants runs once for CPU 0 even when the mask is empty.

That means any code which does not have conditional code for some of the
cpumask functions is potentially broken. Sigh.
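
To make the failure mode concrete, here is a small userspace sketch, not
kernel code. All toy_* names and both helper functions are made up for
illustration. The first macro has the same shape as the UP definition
quoted above; the second is just one possible mask-respecting variant,
not a proposed fix.

```c
#include <stddef.h>

/* Hypothetical one-CPU mask; the kernel's struct cpumask is more involved. */
struct toy_cpumask {
	unsigned long bits;
};

/* Same shape as the UP macro above: the mask is evaluated but ignored. */
#define toy_for_each_cpu_up(cpu, mask)			\
	for ((cpu) = 0; (cpu) < 1; (cpu)++, (void)(mask))

/* Illustrative alternative that actually tests bit 0 of the mask. */
#define toy_for_each_cpu_checked(cpu, mask)		\
	for ((cpu) = 0; (cpu) < 1 && ((mask)->bits & 1UL); (cpu)++)

/* Count how often the loop body runs with the UP-style macro. */
static int count_visits_up(const struct toy_cpumask *mask)
{
	int cpu, visits = 0;

	toy_for_each_cpu_up(cpu, mask)
		visits++;
	return visits;
}

/* Count how often the loop body runs when bit 0 is honoured. */
static int count_visits_checked(const struct toy_cpumask *mask)
{
	int cpu, visits = 0;

	toy_for_each_cpu_checked(cpu, mask)
		visits++;
	return visits;
}

/* Stand-in for a per-CPU perf event; NULL means it was never created. */
struct toy_event {
	int released;
};

/*
 * Cleanup loop in the style of the watchdog code: because the UP-style
 * macro visits CPU 0 even for an empty mask, the event pointer has to
 * be checked before it is used. Returns the number of events released.
 */
static int toy_cleanup(const struct toy_cpumask *dead_mask,
		       struct toy_event **slot)
{
	int cpu, released = 0;

	toy_for_each_cpu_up(cpu, dead_mask) {
		struct toy_event *event = *slot;

		*slot = NULL;
		if (event) {
			event->released = 1;
			released++;
		}
	}
	return released;
}
```

With an empty mask, count_visits_up() still reports one visit while
count_visits_checked() reports none. toy_cleanup() survives a NULL slot
only because of the if (event) check.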

The simple cure for the watchdog is below.

Thanks,

tglx
8<------------------

diff --git a/kernel/watchdog_hld.c b/kernel/watchdog_hld.c
index b2931154b5f2..d4c0f75b189e 100644
--- a/kernel/watchdog_hld.c
+++ b/kernel/watchdog_hld.c
@@ -221,7 +221,12 @@ void hardlockup_detector_perf_cleanup(void)
 		struct perf_event *event = per_cpu(watchdog_ev, cpu);
 
 		per_cpu(watchdog_ev, cpu) = NULL;
-		perf_event_release_kernel(event);
+		/*
+		 * Check the event, because on UP for_each_cpu() assumes
+		 * idiotically that all masks handed in have bit 0 set.
+		 */
+		if (event)
+			perf_event_release_kernel(event);
 	}
 	cpumask_clear(&dead_events_mask);
 }