Re: soft lockup detector & virtualisation

From: Eric B Munson
Date: Thu Feb 16 2012 - 20:57:44 EST


On Thu, 16 Feb 2012 17:39:38 -0800, john stultz wrote:
> On Thu, Feb 16, 2012 at 3:15 PM, Dave Jones <davej@xxxxxxxxxx> wrote:
> > Lately I've noticed quite a few soft lockup bugs being reported.
> > In many of them, they're coming from inside virtual guests.
> >
> > Is the softlockup detector fundamentally broken in this situation ?
> >
> > If the host doesn't schedule the guest for whatever reason,
> > or the user suspends the VM and resumes it later ?
> >
> > Here's the most recent example:
> > https://bugzilla.redhat.com/attachment.cgi?id=563767
> >
> > In many of these, the code where it's "stuck" isn't anything
> > special, which is why I think the guest just hasn't had a
> > timeslice in 185 seconds.
> >
> > Is there some way we can perhaps detect we're running virtualised,
> > and disable the detector automatically ?
>
> I think Eric's work (See "Add check for suspended vm in softlockup
> detector" sent out today) tries to address this issue.
>
> thanks
> -john


The work I have been doing specifically handles the case where the
hypervisor suspends the guest. There is talk of extending that work to
also handle the guest being preempted (not scheduled) by the host, which
I think will cover your use case.
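
To make the idea concrete, here is a rough sketch of the check in Python.
It is only a toy model and not the actual patch: the Watchdog class, the
guest_paused flag and the 20 second threshold are all assumptions made up
for illustration. The point is just that when the host reports that the
guest was paused, the detector restarts its measurement window instead of
reporting a lockup.

```python
from dataclasses import dataclass

# Assumed threshold in seconds, chosen only for this sketch.
SOFTLOCKUP_THRESHOLD = 20.0


@dataclass
class Watchdog:
    """Toy model of a soft lockup detector (hypothetical names throughout)."""

    last_touch: float = 0.0     # last time the watchdog was serviced
    guest_paused: bool = False  # hypothetical flag the host sets on suspend

    def touch(self, now: float) -> None:
        """Record that the watchdog was serviced at time `now`."""
        self.last_touch = now

    def check(self, now: float) -> bool:
        """Return True if a soft lockup should be reported at time `now`."""
        if self.guest_paused:
            # The gap came from the host pausing the guest, not from a
            # stuck CPU: clear the flag and restart the window.
            self.guest_paused = False
            self.last_touch = now
            return False
        return now - self.last_touch > SOFTLOCKUP_THRESHOLD
```

Without the flag, a guest that gets no timeslice for 185 seconds trips the
detector as soon as it runs again. With the flag set by the host, the same
gap is ignored once and detection resumes from that point.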

Eric