Re: [PATCH v2] KVM: selftests: Run dirty_log_perf_test on specific cpus
From: Vipin Sharma
Date: Fri Aug 19 2022 - 20:18:05 EST
On Fri, Aug 19, 2022 at 3:59 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
>
> On Fri, Aug 19, 2022, David Matlack wrote:
> > On Fri, Aug 19, 2022 at 03:20:06PM -0700, Vipin Sharma wrote:
> > > On Fri, Aug 19, 2022 at 2:38 PM David Matlack <dmatlack@xxxxxxxxxx> wrote:
> > > > I think we should move all the logic to pin threads to perf_test_util.c.
> > > > The only thing dirty_log_perf_test.c should do is pass optarg into
> > > > perf_test_util.c. This will make it trivial for any other test based on
> > > > perf_test_util.c to also use pinning.
> > > >
> > > > e.g. All a test needs to do to use pinning is add a flag to the optlist
> > > > and add a case statement like:
> > > >
> > > > case 'c':
> > > > perf_test_setup_pinning(optarg);
> > > > break;
> > > >
> > > > perf_test_setup_pinning() would:
> > > > - Parse the list and populate perf_test_vcpu_args with each vCPU's
> > > > assigned pCPU.
> > > > - Pin the current thread to its assigned pCPU if one is provided.
> > > >
> > >
> > > This assumes all tests have the same pinning requirement and
> > > format. What if some tests have more than one worker thread in
> > > addition to the vCPUs?
> >
> > Even if a test has other worker threads, this proposal would still be
> > logically consistent. The flag is defined to only control the vCPU
> > threads and the main threads. If some test has some other threads
> > besides that, this flag will not affect them (which is exactly how it's
> > defined to behave). If such a test wants to pin those other threads, it
> > would make sense to have a test-specific flag for that pinning (and we
> > can figure out the right way to do that if/when we encounter that
> > situation).
>
> ...
>
> > Yeah and I also realized perf_test_setup_pinning() will need to know how
> > many vCPUs there are so it can determine which element is the main
> > thread's pCPU assignment.
>
> The "how many workers you got?" conundrum can be solved in the same way, e.g. just
> have the caller pass in the number of workers it will create.
>
> perf_test_setup_pinning(pin_string, nr_vcpus, NR_WORKERS);
>
> The only question is what semantics we should support for workers, e.g. do we
> want to force an all-or-none approach or can the user pin a subset. All-or-none
> seems like it'd be the simplest to maintain and understand. I.e. if -c is used,
> then all vCPUs must be pinned, and either all workers or no workers are pinned.
Combining both of your suggestions to make sure everyone is on the same page:
the perf_test_setup_pinning(pin_string, nr_vcpus) API expects (nr_vcpus + 1)
entries in "pin_string"; it uses the first nr_vcpus entries for the vCPUs
and the final entry for the caller thread's pCPU.
"pin_string" must always have AT LEAST (nr_vcpus + 1) entries. If common
workers are added in the future, any entries beyond the first (nr_vcpus + 1)
will be assigned to the common worker threads in a predefined order, matching
the order in which the workers are added to the code. perf_test_util will
need a way to know how many common workers exist. There can be more workers
than entries; that is fine, the extra workers simply will not be pinned.
If this is not desirable, let us hold off on that discussion until we
actually have common workers; meanwhile, keep exactly (nr_vcpus + 1) entries
and work accordingly.