kernel 5.2+: suspend freeze in VMware Player.
From: Woody Suwalski
Date: Sat Nov 23 2019 - 17:51:25 EST
Rafael, Thomas, this is the same VMware Player 15.2 freeze-on-suspend
issue that I discussed with you in August.
It surfaced after Thomas Gleixner's change in kernel 5.2:
dfe0cf8b x86/ioapic: Implement irq_get_irqchip_state() callback
It is still with us in 5.4, 100% repeatable on a second suspend after a
reboot.
I have traced it down to the ioapic_irq_get_chip_state() function, where
rentry.irr is stuck high.
On the first suspend I can see that for IRQ9 the test exits with irr=0,
trigger=1, but on the second and all subsequent suspends it returns
irr=1, trigger=1, so *state=1, and this results in a never-ending loop
in __synchronize_hardirq(), because inprogress is always 1.
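To make the condition concrete, here is a minimal user-space sketch of
the test as I described it above. It is not the kernel code: struct
rte_model and model_irq_active() are hypothetical names used only for
illustration, and the sketch assumes the line is reported active exactly
when both irr and trigger are set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two redirection-entry bits discussed
 * above; this is not the kernel's own structure. */
struct rte_model {
    unsigned int irr;      /* 1 = interrupt delivered, not yet acknowledged */
    unsigned int trigger;  /* 1 = level-triggered, 0 = edge-triggered */
};

/* Models the test: the line counts as active only if irr && trigger. */
static bool model_irq_active(const struct rte_model *rte)
{
    assert(rte != NULL);
    return rte->irr && rte->trigger;
}
```

With irr=0, trigger=1 (first suspend) this returns false and a caller
would stop waiting. With irr=1, trigger=1 (second suspend) it returns
true on every poll, which is why a caller that waits for "inactive"
without a limit never exits.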
I have been using a "fix" that times out in __synchronize_hardirq() after
64 iterations, and that seems to work OK (no side effects noticed),
but of course it does not address the underlying problem.
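The idea behind the workaround, reduced to a user-space sketch (this is
not the actual patch; bounded_sync(), inprogress_fn and the two pollers
are hypothetical names, and only the limit of 64 comes from what I am
running):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SYNC_MAX_POLLS 64   /* iteration limit used in my workaround */

/* Hypothetical callback that reports whether the IRQ is still in progress. */
typedef bool (*inprogress_fn)(void *ctx);

/* Poll until the IRQ goes idle, but give up after SYNC_MAX_POLLS polls.
 * Returns true if it went idle, false if the limit was reached. */
static bool bounded_sync(inprogress_fn inprogress, void *ctx)
{
    assert(inprogress != NULL);
    for (int i = 0; i < SYNC_MAX_POLLS; i++) {
        if (!inprogress(ctx))
            return true;
    }
    return false;
}

/* Poller modelling the stuck case: always reports "in progress". */
static bool stuck_high(void *ctx)
{
    (void)ctx;
    return true;
}

/* Poller modelling a healthy case: busy for *ctx more polls, then idle. */
static bool clears_after(void *ctx)
{
    int *remaining = ctx;
    if (*remaining > 0) {
        (*remaining)--;
        return true;
    }
    return false;
}
```

With stuck_high() the loop gives up after 64 polls instead of spinning
forever; with clears_after() it returns as soon as the state clears.
Whether giving up silently is acceptable, or whether it should at least
warn, is a separate question.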
Could the problem be somewhere in the VMware emulation code, which
returns bad data?
Do you have any ideas as to what the right setting for IRQ9 should be
in a VM environment? Edge or level?
And which part of the code reads the "hardware" state from VMware?
OTOH, the current implementation is not really safe, as the wait loop
should have a timeout, or else it may get stuck. Should I provide my
safety-exit patch?
Thanks, Woody