Re: d57108d4f6 ("watchdog/core: Get rid of the thread .."): BUG: unable to handle kernel NULL pointer dereference at 0000000000000208

From: Fengguang Wu
Date: Sat Sep 16 2017 - 08:47:02 EST


On Fri, Sep 15, 2017 at 06:24:20PM +0200, Thomas Gleixner wrote:
> On Fri, 15 Sep 2017, Thomas Gleixner wrote:
>
> > On Fri, 15 Sep 2017, Thomas Gleixner wrote:
> >
> > > On Fri, 15 Sep 2017, kernel test robot wrote:
> > > > [ 0.035023] CPU: Intel Common KVM processor (family: 0xf, model: 0x6, stepping: 0x1)
> > > > [ 0.042302] Performance Events: unsupported Netburst CPU model 6 no PMU driver, software events only.
> > >
> > > Cute. So there is no supported PMU, but for some unknown reason the lockup
> > > detector can create an event, otherwise the perf availability check in
> > > lockup_detector_init() would fail ....
> > >
> > > Peter???
> >
> > In my VM the corresponding dmesg is:
> >
> > [ 0.038086] Performance Events: unsupported p6 CPU model 61 no PMU driver, software events only.

What's your host CPU? I can reproduce it on Nehalem, Haswell and Sandy
Bridge machines with the attached script.

> > [ 0.041031] Hierarchical SRCU implementation.
> > [ 0.046210] NMI watchdog: Perf event create on CPU 0 failed with -2
> > [ 0.046980] NMI watchdog: Perf NMI watchdog permanetely disabled
> >
> > Confused
>
> I still can't reproduce. Can you please apply the debug patch below and
> provide the output?

OK. I'll try and report back tomorrow.

Thanks,
Fengguang

8<-----------------

diff --git a/kernel/watchdog_hld.c b/kernel/watchdog_hld.c
index b2931154b5f2..e6c9ca516945 100644
--- a/kernel/watchdog_hld.c
+++ b/kernel/watchdog_hld.c
@@ -171,6 +171,7 @@ static int hardlockup_detector_event_create(void)
 	/* Try to register using hardware perf events */
 	evt = perf_event_create_kernel_counter(wd_attr, cpu, NULL,
 					       watchdog_overflow_callback, NULL);
+	pr_info("EVT create on CPU %u returned %p\n", cpu, evt);
 	if (IS_ERR(evt)) {
 		pr_info("Perf event create on CPU %d failed with %ld\n", cpu,
 			PTR_ERR(evt));
@@ -221,7 +222,10 @@ void hardlockup_detector_perf_cleanup(void)
 		struct perf_event *event = per_cpu(watchdog_ev, cpu);
 
 		per_cpu(watchdog_ev, cpu) = NULL;
-		perf_event_release_kernel(event);
+		pr_info("EVT on CPU %u in dead mask: %p\n", cpu, event);
+		if (event)
+			perf_event_release_kernel(event);
+
 	}
 	cpumask_clear(&dead_events_mask);
 }

#!/bin/bash

kernel=$1

kvm=(
	qemu-system-x86_64
	-enable-kvm
	-cpu kvm64
	-kernel "$kernel"
	-m 399
	-smp 2
	-device e1000,netdev=net0
	-netdev user,id=net0
	-boot order=nc
	-no-reboot
	-watchdog i6300esb
	-watchdog-action debug
	-rtc base=localtime
	-serial stdio
	-display none
	-monitor null
)

append=(
	root=/dev/ram0
	hung_task_panic=1
	debug
	apic=debug
	sysrq_always_enabled
	rcupdate.rcu_cpu_stall_timeout=100
	net.ifnames=0
	printk.devkmsg=on
	panic=-1
	softlockup_panic=1
	nmi_watchdog=panic
	oops=panic
	load_ramdisk=2
	prompt_ramdisk=0
	drbd.minor_count=8
	systemd.log_level=err
	ignore_loglevel
	console=tty0
	earlyprintk=ttyS0,115200
	console=ttyS0,115200
	vga=normal
	rw
	drbd.minor_count=8
)

"${kvm[@]}" -append "${append[*]}"