Re: [RFC PATCH 02/41] perf: Support guest enter/exit interfaces

From: Zhang, Xiong Y
Date: Tue Apr 16 2024 - 01:34:40 EST




On 4/16/2024 12:03 AM, Liang, Kan wrote:
>
>
> On 2024-04-12 4:56 p.m., Liang, Kan wrote:
>>> What if perf had a global knob to enable/disable mediate PMU support? Then when
>>> KVM is loaded with enable_mediated_true, call into perf to (a) check that there
>>> are no existing !exclude_guest events (this part could be optional), and (b) set
>>> the global knob to reject all new !exclude_guest events (for the core PMU?).
>>>
>>> Hmm, or probably better, do it at VM creation. That has the advantage of playing
>>> nice with CONFIG_KVM=y (perf could reject the enabling without completely breaking
>>> KVM), and not causing problems if KVM is auto-probed but the user doesn't actually
>>> want to run VMs.
>> I think it should be doable, and may simplify the perf implementation.
>> (The check in the schedule stage should not be necessary anymore.)
>>
>> With this, something like NMI watchdog should fail the VM creation. The
>> user should either disable the NMI watchdog or use a replacement.
>>
>> Thanks,
>> Kan
>>> E.g. (very roughly)
>>>
>>> int x86_perf_get_mediated_pmu(void)
>>> {
>>> if (refcount_inc_not_zero(...))
>>> return 0;
>>>
>>> if (<system wide events>)
>>> return -EBUSY;
>>>
>>> <slow path with locking>
>>> }
>>>
>>> void x86_perf_put_mediated_pmu(void)
>>> {
>>> if (!refcount_dec_and_test(...))
>>> return;
>>>
>>> <slow path with locking>
>>> }
>
>
> I think the locking should include the refcount check and system wide
> event check as well.
> It should be possible that two VMs are created very close.
> The second creation may mistakenly return 0 if there is no lock.
>
> I plan to do something as below (not test yet).
>
> +/*
> + * Currently invoked at VM creation to
> + * - Check whether there are existing !exclude_guest system wide events
> + * of PMU with PERF_PMU_CAP_MEDIATED_VPMU
> + * - Set nr_mediated_pmu to prevent !exclude_guest event creation on
> + * PMUs with PERF_PMU_CAP_MEDIATED_VPMU
> + *
> + * No impact for the PMU without PERF_PMU_CAP_MEDIATED_VPMU. The perf
> + * still owns all the PMU resources.
> + */
> +int x86_perf_get_mediated_pmu(void)
> +{
> + int ret = 0;
> + mutex_lock(&perf_mediated_pmu_mutex);
> + if (refcount_inc_not_zero(&nr_mediated_pmu_vms))
> + goto end;
> +
> + if (atomic_read(&nr_include_guest_events)) {
> + ret = -EBUSY;
> + goto end;
> + }
> + refcount_inc(&nr_mediated_pmu_vms);
> +end:
> + mutex_unlock(&perf_mediated_pmu_mutex);
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(x86_perf_get_mediated_pmu);
> +
> +void x86_perf_put_mediated_pmu(void)
> +{
> + mutex_lock(&perf_mediated_pmu_mutex);
> + refcount_dec(&nr_mediated_pmu_vms);
> + mutex_unlock(&perf_mediated_pmu_mutex);
> +}
> +EXPORT_SYMBOL_GPL(x86_perf_put_mediated_pmu);
>
>
> Thanks,
> Kan
x86_perf_get_mediated_pmu() is called at vm_create(), x86_perf_put_mediated_pmu() is called at vm_destroy(), then system wide perf events without exclude_guest=1 can not be created during the whole vm life cycle (where nr_mediated_pmu_vms > 0 always), do I understand and use the interface correctly ?

thanks