Re: [PATCH v5 RESEND] arch/powerpc/kvm: Add support for reading VPA counters for pseries guests

From: Naveen N Rao
Date: Thu Apr 25 2024 - 07:27:35 EST


On Wed, Apr 24, 2024 at 11:08:38AM +0530, Gautam Menghani wrote:
> On Mon, Apr 22, 2024 at 09:15:02PM +0530, Naveen N Rao wrote:
> > On Tue, Apr 02, 2024 at 12:36:54PM +0530, Gautam Menghani wrote:
> > > static int kvmhv_vcpu_entry_nestedv2(struct kvm_vcpu *vcpu, u64
> > > time_limit,
> > > unsigned long lpcr, u64 *tb)
> > > {
> > > @@ -4130,6 +4161,11 @@ static int kvmhv_vcpu_entry_nestedv2(struct kvm_vcpu *vcpu, u64 time_limit,
> > > kvmppc_gse_put_u64(io->vcpu_run_input, KVMPPC_GSID_LPCR, lpcr);
> > >
> > > accumulate_time(vcpu, &vcpu->arch.in_guest);
> > > +
> > > + /* Enable the guest host context switch time tracking */
> > > + if (unlikely(trace_kvmppc_vcpu_exit_cs_time_enabled()))
> > > + kvmhv_set_l2_accumul(1);
> > > +
> > > rc = plpar_guest_run_vcpu(0, vcpu->kvm->arch.lpid, vcpu->vcpu_id,
> > > &trap, &i);
> > >
> > > @@ -4156,6 +4192,10 @@ static int kvmhv_vcpu_entry_nestedv2(struct kvm_vcpu *vcpu, u64 time_limit,
> > >
> > > timer_rearm_host_dec(*tb);
> > >
> > > + /* Record context switch and guest_run_time data */
> > > + if (kvmhv_get_l2_accumul())
> > > + do_trace_nested_cs_time(vcpu);
> > > +
> > > return trap;
> > > }
> >
> > I'm assuming the counters in VPA are cumulative, since you are zero'ing
> > them out on exit. If so, I think a better way to implement this is to
> > use TRACE_EVENT_FN() and provide tracepoint registration and
> > unregistration functions. You can then enable the counters once during
> > registration and avoid repeated writes to the VPA area. With that, you
> > also won't need to do anything before vcpu entry. If you maintain
> > previous values, you can calculate the delta and emit the trace on vcpu
> > exit. The values in VPA area can then serve as the cumulative values.
> >
>
> This approach will have a problem. The context switch times are reported
> in the L1 LPAR's CPU's VPA area. Consider the following scenario:
>
> 1. L1 has 2 cpus, and L2 has 1 cpu
> 2. L2 runs on L1's cpu0 for a few seconds, and the counter values go to
> 1 million
> 3. We are maintaining a copy of values of VPA in separate variables, so
> those variables also have 1 million.
> 4. Now if L2's vcpu is migrated to another L1 cpu, that L1 cpu's VPA
> counters will start from 0, so if we try to get delta value, we will end
> up doing 0 - 1 million, which would be wrong.

I'm assuming you mean migrating the task. If we maintain the previous
readings in paca, it should work I think.

>
> The aggregation logic in this patch works as we zero out the VPA after
> every switch, and maintain aggregation in a vcpu->arch

Are the cumulative values of the VPA counters of no significance? We
lose those with this approach. Not sure if we care.


- Naveen