Re: assert from intel_pmu_hw_config

From: Liang, Kan
Date: Mon Oct 17 2022 - 10:27:01 EST

On 2022-10-14 6:15 p.m., Peter Zijlstra wrote:
> Hi Kan,
>
> While fuzzing on my ADL, I saw a splat (sadly not captured because
> MeshCommander is a pain in the backside).
>
> The thing I did recover was that it was the new
> lockdep_assert_event_ctx() triggering from intel_pmu_hw_config().
>
> Now; that code reads:
>
>        if (require_mem_loads_aux_event(event) &&
>            (event->attr.sample_type & PERF_SAMPLE_DATA_SRC) &&
>            is_mem_loads_event(event)) {
>                struct perf_event *leader = event->group_leader;
>                struct perf_event *sibling = NULL;
>
>                if (!is_mem_loads_aux_event(leader)) {
>                        for_each_sibling_event(sibling, leader) {
>                                if (is_mem_loads_aux_event(sibling))
>                                        break;
>                        }
>                        if (list_entry_is_head(sibling, &leader->sibling_list, sibling_list))
>                                return -ENODATA;
>                }
>        }
>
> And it is trying to assert leader->ctx->mutex is held.
>
> Now, the calling context perf_try_init_event() has:
>
>        /*
>         * A number of pmu->event_init() methods iterate the sibling_list to,
>         * for example, validate if the group fits on the PMU. Therefore,
>         * if this is a sibling event, acquire the ctx->mutex to protect
>         * the sibling_list.
>         */
>        if (event->group_leader != event && pmu->task_ctx_nr != perf_sw_context) {
>                /*
>                 * This ctx->mutex can nest when we're called through
>                 * inheritance. See the perf_event_ctx_lock_nested() comment.
>                 */
>                ctx = perf_event_ctx_lock_nested(event->group_leader,
>                                                 SINGLE_DEPTH_NESTING);
>                BUG_ON(!ctx);
>        }
>
> IOW; we only hold leader->ctx->mutex when event is *NOT* the group
> leader; while the above code *can* in fact use for_each_sibling_event()
> on the group leader when conditions are just right.
>
> Now, it's really late and my brain has long since started the weekend,
> but I think something like the below ought to fix things.
>
> Does that make sense? IIRC this would not destroy the purpose of this
> code -- although admittedly, the comment there tickles only vague
> memories.
>
> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> index d8af75466ee9..450463d36450 100644
> --- a/arch/x86/events/intel/core.c
> +++ b/arch/x86/events/intel/core.c
> @@ -3975,7 +3975,7 @@ static int intel_pmu_hw_config(struct perf_event *event)
>                  struct perf_event *leader = event->group_leader;
>                  struct perf_event *sibling = NULL;
>  
> -                if (!is_mem_loads_aux_event(leader)) {
> +                if (event != leader && !is_mem_loads_aux_event(leader)) {
>                          for_each_sibling_event(sibling, leader) {
>                                  if (is_mem_loads_aux_event(sibling))
>                                          break;

If the leader itself is the load latency event, I think we can error out
directly, because the auxiliary event can never be in front of the load
latency event. Does the below patch help?

                 struct perf_event *leader = event->group_leader;
                 struct perf_event *sibling = NULL;
 
+                if (event == leader)
+                        return -ENODATA;
+
                 if (!is_mem_loads_aux_event(leader)) {
                         for_each_sibling_event(sibling, leader) {
                                 if (is_mem_loads_aux_event(sibling))
                                         break;
                         }
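
For clarity, with that change the whole block in intel_pmu_hw_config()
would read roughly as below (untested sketch; the comment just spells out
the intent discussed above):

        if (require_mem_loads_aux_event(event) &&
            (event->attr.sample_type & PERF_SAMPLE_DATA_SRC) &&
            is_mem_loads_event(event)) {
                struct perf_event *leader = event->group_leader;
                struct perf_event *sibling = NULL;

                /*
                 * A standalone load latency event can never have the
                 * auxiliary event in front of it, and perf_try_init_event()
                 * only takes leader->ctx->mutex when event != leader, so
                 * bail out here before touching leader->sibling_list.
                 */
                if (event == leader)
                        return -ENODATA;

                if (!is_mem_loads_aux_event(leader)) {
                        for_each_sibling_event(sibling, leader) {
                                if (is_mem_loads_aux_event(sibling))
                                        break;
                        }
                        if (list_entry_is_head(sibling, &leader->sibling_list, sibling_list))
                                return -ENODATA;
                }
        }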

Thanks,
Kan