Re: [PATCH v3 12/12] KVM: arm64: selftests: arch_timer: Support vCPU migration

From: Andrew Jones
Date: Mon Sep 06 2021 - 02:39:18 EST


On Fri, Sep 03, 2021 at 01:53:27PM -0700, Raghavendra Rao Ananta wrote:
> On Fri, Sep 3, 2021 at 4:05 AM Andrew Jones <drjones@xxxxxxxxxx> wrote:
> >
> > On Wed, Sep 01, 2021 at 09:14:12PM +0000, Raghavendra Rao Ananta wrote:
> > > Since the timer stack (hardware and KVM) is per-CPU, there
> > > are potential chances for races to occur when the scheduler
> > > decides to migrate a vCPU thread to a different physical CPU.
> > > Hence, include an option to stress-test this part as well by
> > > forcing the vCPUs to migrate across physical CPUs in the
> > > system at a particular rate.
> > >
> > > Originally, the bug for the fix with commit 3134cc8beb69d0d
> > > ("KVM: arm64: vgic: Resample HW pending state on deactivation")
> > > was discovered using arch_timer test with vCPU migrations and
> > > can be easily reproduced.
> > >
> > > Signed-off-by: Raghavendra Rao Ananta <rananta@xxxxxxxxxx>
> > > ---
> > > .../selftests/kvm/aarch64/arch_timer.c | 108 +++++++++++++++++-
> > > 1 file changed, 107 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/tools/testing/selftests/kvm/aarch64/arch_timer.c b/tools/testing/selftests/kvm/aarch64/arch_timer.c
> > > index 1383f33850e9..de246c7afab2 100644
> > > --- a/tools/testing/selftests/kvm/aarch64/arch_timer.c
> > > +++ b/tools/testing/selftests/kvm/aarch64/arch_timer.c
> > > @@ -14,6 +14,8 @@
> > > *
> > > * The test provides command-line options to configure the timer's
> > > * period (-p), number of vCPUs (-n), and iterations per stage (-i).
> > > + * To stress-test the timer stack even more, an option to migrate the
> > > + * vCPUs across pCPUs (-m), at a particular rate, is also provided.
> > > *
> > > * Copyright (c) 2021, Google LLC.
> > > */
> > > @@ -24,6 +26,8 @@
> > > #include <pthread.h>
> > > #include <linux/kvm.h>
> > > #include <linux/sizes.h>
> > > +#include <linux/bitmap.h>
> > > +#include <sys/sysinfo.h>
> > >
> > > #include "kvm_util.h"
> > > #include "processor.h"
> > > @@ -41,12 +45,14 @@ struct test_args {
> > > int nr_vcpus;
> > > int nr_iter;
> > > int timer_period_ms;
> > > + int migration_freq_ms;
> > > };
> > >
> > > static struct test_args test_args = {
> > > .nr_vcpus = NR_VCPUS_DEF,
> > > .nr_iter = NR_TEST_ITERS_DEF,
> > > .timer_period_ms = TIMER_TEST_PERIOD_MS_DEF,
> > > + .migration_freq_ms = 0, /* Turn off migrations by default */
> >
> > I'd rather we enable good tests like these by default.
> >
> Well, that was my original idea, but I was concerned about the ease
> for diagnosing
> things since it'll become too noisy. And so I let it as a personal
> preference. But I can
> include it back and see how it goes.

Break the tests into two? One run without migration and one with. If
neither work, then we can diagnose the one without first, possibly
avoiding the need to diagnose the one with.

Thanks,
drew