setup_boot_APIC_clock() NULL dereference during early boot on reduced hardware platforms

From: Daniel Drake
Date: Thu Aug 01 2019 - 02:25:09 EST


Hi,

Working with a new consumer laptop based on AMD R7-3700U, we are
seeing a kernel panic during early boot (before the display
initializes). It's a new product and there is no known-working
earlier kernel version (tested 5.0, 5.2 and current Linus master).

We may have also seen this problem on a MiniPC based on AMD APU 7010
from another vendor, but we don't have it on hand right now to
confirm that it's the exact same crash.

earlycon shows the details: a NULL dereference under
setup_boot_APIC_clock(), which actually happens in
calibrate_APIC_clock():

	/* Replace the global interrupt handler */
	real_handler = global_clock_event->event_handler;
	global_clock_event->event_handler = lapic_cal_handler;

global_clock_event is NULL here. This is a "reduced hardware" ACPI
platform so acpi_generic_reduced_hw_init() has set timer_init to NULL,
avoiding the usual codepaths that would set up global_clock_event.

I tried the obvious:
	if (!global_clock_event)
		return -1;

However I'm probably missing part of the big picture here, as this
only makes boot fail later on. It continues until the next point where
something leads to schedule(), such as a driver calling msleep() or
mark_readonly() calling rcu_barrier(), etc. Then it hangs.
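
To illustrate the failure path and the guard described above, here is a
minimal userspace sketch. It is not kernel code: every name ending in
_model is a hypothetical stand-in, and the reduced hardware case is
modelled simply as an init hook that never registers a device. It only
shows why the pointer is still NULL at calibration time and what the
guard returns; it does not model the later hang.

```c
/* Userspace model only; none of this is kernel code. */
#include <stddef.h>

/* Stand-in for the kernel's struct clock_event_device. */
struct clock_event_model {
	void (*event_handler)(struct clock_event_model *dev);
};

/* Hypothetical legacy timer device and the global pointer to it. */
struct clock_event_model model_timer;
struct clock_event_model *global_clock_event_model; /* NULL until registered */

/* Normal path: the timer init hook registers the global device. */
void timer_init_legacy_model(void)
{
	global_clock_event_model = &model_timer;
}

/* Reduced hardware path (assumption): the hook registers nothing. */
void timer_init_reduced_hw_model(void)
{
}

/* Stand-in for lapic_cal_handler(); does nothing in this model. */
void lapic_cal_handler_model(struct clock_event_model *dev)
{
	(void)dev;
}

/* Returns 0 if the handler was swapped in, -1 if there is no device. */
int calibrate_apic_clock_model(void)
{
	/* The guard from the mail; without it the next statement faults. */
	if (!global_clock_event_model)
		return -1;

	global_clock_event_model->event_handler = lapic_cal_handler_model;
	return 0;
}

/* Run one boot path from a clean state and return the calibration result. */
int boot_model(int reduced_hw)
{
	global_clock_event_model = NULL;
	model_timer.event_handler = NULL;

	if (reduced_hw)
		timer_init_reduced_hw_model();
	else
		timer_init_legacy_model();

	return calibrate_apic_clock_model();
}
```

In this model boot_model(1) takes the guard and returns -1, while
boot_model(0) installs the calibration handler and returns 0. The guard
avoids the NULL dereference but leaves the system without a calibrated
timer, which is consistent with boot only failing later.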

Is something missing in terms of timer setup here? Suggestions appreciated...

Thanks
Daniel