Re: [PATCH v9 04/10] perf/x86: Keep LBR stack unchanged on the host for guest LBR event

From: Peter Zijlstra
Date: Thu Apr 09 2020 - 12:46:04 EST


On Fri, Mar 13, 2020 at 10:16:10AM +0800, Like Xu wrote:
> When a guest wants to use the LBR stack, its hypervisor creates a guest
> LBR event and let host perf schedules it. A new 'int guest_lbr_enabled'
> field in the "struct cpu_hw_events", is marked as true when perf adds
> a guest LBR event and false on deletion.
>
> The LBR stack msrs are accessible to the guest when its guest LBR event
> is scheduled in by the perf subsystem. Before scheduling out the event,
> we should avoid host changes on IA32_DEBUGCTLMSR or LBR_SELECT. Otherwise,
> some unexpected branch operations may interfere with guest behavior,
> pollute LBR records, and even cause host branch data leakage. In addition,
> the intel_pmu_lbr_read() on the host is also avoidable for guest usage.
>
> On v4 PMU or later, the LBR stack are frozen on the overflowed condition
> if Freeze_LBR_On_PMI is true and resume recording via acking LBRS_FROZEN
> to global status msr instead of re-enabling IA32_DEBUGCTL.LBR. So when a
> guest LBR event is running, the host PMI handler has to keep LBRS_FROZEN
> bit set (thus LBR being frozen) until the guest enables it. Otherwise,
> when the guest enters non-root mode, the LBR will start recording and
> the guest PMI handler code will also pollute the LBR stack.
>
> To ensure that guest LBR records are not lost during the context switch,
> the BRANCH_CALL_STACK flag should be configured in the 'branch_sample_type'
> for a guest LBR event because a callstack event could save/restore guest
> unread records with the help of intel_pmu_lbr_sched_task() naturally.
>
> However, the regular host LBR perf event doesn't save/restore LBR_SELECT,
> because it's configured in the LBR_enable() based on branch_sample_type.
> So when a guest LBR is running, the guest LBR_SELECT may changes for its
> own use and we have to add the LBR_SELECT save/restore to ensure what the
> guest LBR_SELECT value doesn't get lost during the context switching.

I had to read the patch before that made sense; I think it's mostly
there, but it can use a little help.


> @@ -691,8 +714,12 @@ void intel_pmu_lbr_read(void)
> *
> * This could be smarter and actually check the event,
> * but this simple approach seems to work for now.
> + *
> + * And there is no need to read lbr here if a guest LBR event

There's 'lbr' and 'LBR' in the same sentence

> + * is using it, because the guest will read them on its own.
> */
> - if (!cpuc->lbr_users || cpuc->lbr_users == cpuc->lbr_pebs_users)
> + if (!cpuc->lbr_users || cpuc->guest_lbr_enabled ||
> + cpuc->lbr_users == cpuc->lbr_pebs_users)

indent fail

> return;
>
> if (x86_pmu.intel_cap.lbr_format == LBR_FORMAT_32)