Re: [Bug #12465] KVM guests stalling on 2.6.28 (bisected)

From: Frederic Weisbecker
Date: Mon Mar 16 2009 - 16:07:55 EST


On Mon, Mar 16, 2009 at 11:16:35PM +1030, Kevin Shanahan wrote:
> On Mon, 2009-03-16 at 11:49 +0200, Avi Kivity wrote:
> > Kevin Shanahan wrote:
> > > On Sat, 2009-03-14 at 20:20 +0100, Rafael J. Wysocki wrote:
> > >
> > >> This message has been generated automatically as a part of a report
> > >> of regressions introduced between 2.6.27 and 2.6.28.
> > >>
> > >> The following bug entry is on the current list of known regressions
> > >> introduced between 2.6.27 and 2.6.28. Please verify if it still should
> > >> be listed and let me know (either way).
> > >>
> > >> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=12465
> > >> Subject : KVM guests stalling on 2.6.28 (bisected)
> > >> Submitter : Kevin Shanahan <kmshanah@xxxxxxxxxxx>
> > >> Date : 2009-01-17 03:37 (57 days old)
> > >> Handled-By : Avi Kivity <avi@xxxxxxxxxx>
> > >>
> > >
> > > No further updates since the last reminder.
> > > The bug should still be listed.
> >
> > Does the bug reproduce if you use the acpi_pm clocksource in the guests?
>
> In the guest being pinged? Yes, it still happens.


Hi Kevin,

I've looked a bit at your traces.
I think they are probably too broad to find anything in.
The latest -tip comes with a new set of tracing events, meaning
that you will be able to produce function graph traces with various
sched events included.

Another thing: is it possible to reproduce it with only one ping?
Or to send periodic pings and keep only the one that exceeds a relevant
latency threshold? I think we could write a script to do that.
It would make the trace much clearer.
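
Something along these lines might serve as a starting point for such a
script. It is only a rough sketch: it assumes the iputils ping output
format ("time=... ms"), and the function names, the threshold and the
interval are all made up for illustration. It sends one ping at a time
and stops at the first reply slower than the threshold.

```python
import re
import subprocess
import time

# Per-reply round-trip time as printed by iputils ping, e.g. "time=0.131 ms".
# Assumption: other ping implementations may format this differently.
RTT_RE = re.compile(r"time=([0-9.]+) ms")


def parse_rtt_ms(ping_output):
    """Return the first round-trip time (ms) found in ping output, or None."""
    match = RTT_RE.search(ping_output)
    return float(match.group(1)) if match else None


def ping_once(host, timeout_s=60):
    """Send a single ping and return its RTT in ms (None if no reply was seen)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        capture_output=True,
        text=True,
    )
    return parse_rtt_ms(result.stdout)


def wait_for_slow_ping(host, threshold_ms, interval_s=1.0, max_pings=600,
                       ping=ping_once):
    """Ping periodically and return (index, rtt_ms) for the first reply slower
    than threshold_ms, or None if no ping exceeded it within max_pings."""
    for i in range(max_pings):
        rtt = ping(host)
        if rtt is not None and rtt > threshold_ms:
            return i, rtt
        time.sleep(interval_s)
    return None


if __name__ == "__main__":
    # Hypothetical values: stop at the first ping that takes over one second.
    print(wait_for_slow_ping("hermes-old", threshold_ms=1000.0))
```

The idea would be to clear the trace buffer before each ping and stop
tracing as soon as the function returns, so that only the slow ping is
left in the trace. That part is left out here since the tracing commands
still have to be worked out.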

Just wait a bit; I'm looking at which events could be relevant to enable,
and I'll come back to you with a set of commands to test.

Frederic.

> hermes-old:~# cat /sys/devices/system/clocksource/clocksource0/available_clocksource
> kvm-clock acpi_pm jiffies tsc
> hermes-old:~# cat /sys/devices/system/clocksource/clocksource0/current_clocksource
> acpi_pm
>
> kmshanah@flexo:~$ ping -c 600 hermes-old
>
> --- hermes-old.wumi.org.au ping statistics ---
> 600 packets transmitted, 600 received, 0% packet loss, time 599439ms
> rtt min/avg/max/mdev = 0.131/723.197/9941.884/1569.918 ms, pipe 10
>
> I had to reconfigure the guest kernel to make that clocksource
> available. The way I had the guest kernel configured before, it only had
> tsc and jiffies clocksources available. Unstable TSC was detected, so it
> has been using jiffies until now.
>
> Here's another test, using kvm-clock as the guest's clocksource:
>
> hermes-old:~# cat /sys/devices/system/clocksource/clocksource0/current_clocksource
> kvm-clock
>
> kmshanah@flexo:~$ ping -c 600 hermes-old
>
> --- hermes-old.wumi.org.au ping statistics ---
> 600 packets transmitted, 600 received, 0% packet loss, time 599295ms
> rtt min/avg/max/mdev = 0.131/1116.170/30840.411/4171.905 ms, pipe 31
>
> Regards,
> Kevin.
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
