Re: [tip: perf/core] perf/core: Fix endless multiplex timer

From: Robin Murphy
Date: Thu Aug 06 2020 - 16:40:21 EST


On 2020-08-06 19:53, Greg KH wrote:
> On Thu, Aug 06, 2020 at 07:11:24PM +0100, Robin Murphy wrote:
>> On 2020-03-20 12:58, tip-bot2 for Peter Zijlstra wrote:
>>> The following commit has been merged into the perf/core branch of tip:
>>>
>>> Commit-ID: 90c91dfb86d0ff545bd329d3ddd72c147e2ae198
>>> Gitweb: https://git.kernel.org/tip/90c91dfb86d0ff545bd329d3ddd72c147e2ae198
>>> Author: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
>>> AuthorDate: Thu, 05 Mar 2020 13:38:51 +01:00
>>> Committer: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
>>> CommitterDate: Fri, 20 Mar 2020 13:06:22 +01:00
>>>
>>> perf/core: Fix endless multiplex timer
>>>
>>> Kan and Andi reported that we fail to kill rotation when the flexible
>>> events go empty, but the context does not. XXX moar
>>>
>>> Fixes: fd7d55172d1e ("perf/cgroups: Don't rotate events for cgroups unnecessarily")
>>
>> Can this patch (commit 90c91dfb86d0 ("perf/core: Fix endless multiplex
>> timer") upstream) be applied to stable please? For PMU drivers built as
>> modules, the bug can actually kill the system, since the runaway hrtimer
>> loop keeps calling pmu->{enable,disable} after all the events have been
>> closed and dropped their references to pmu->module. Thus legitimately
>> unloading the module once things have got into this state quickly results in
>> a crash when those callbacks disappear.
>>
>> (FWIW I spent about two days fighting with this while testing a new driver
>> as a module against the 5.3 kernel installed on someone else's machine,
>> assuming it was a bug in my code...)
>
> What exactly kernel(s) do you wish for it to be applied to? It's
> already in the latest stable releases of 5.7.y.

Sorry, I implicitly meant 5.4.y there - the buggy commit was merged in 5.3, the fix in 5.7, so I think that's the only "stable" branch in between that warrants explicit action. Apologies if I'm getting the terminology wrong.

Cheers,
Robin.
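
For readers following the thread, the failure mode described in the stable
request above can be illustrated with a toy model. This is only a sketch in
Python, not kernel code: ToyPMU, ToyContext, needs_rotation and run() are
hypothetical names invented for illustration, and the single "stale flag"
is an assumed simplification of how the multiplex timer decides whether to
re-arm itself.

```python
class ToyPMU:
    """Hypothetical stand-in for a PMU driver built as a module."""

    def __init__(self):
        self.loaded = True

    def _callback(self):
        # Once the "module" is gone, its callbacks no longer exist.
        if not self.loaded:
            raise RuntimeError("callback invoked after module unload")

    def disable(self):
        self._callback()

    def enable(self):
        self._callback()


class ToyContext:
    """Hypothetical event context with a rotation (multiplex) timer."""

    def __init__(self, pmu, clear_flag_when_empty):
        self.pmu = pmu
        self.flexible = []
        self.needs_rotation = False
        # Assumption: the "fixed" behaviour clears the flag once the
        # flexible events are gone; the "buggy" behaviour leaves it set.
        self.clear_flag_when_empty = clear_flag_when_empty

    def add_event(self, name):
        self.flexible.append(name)
        self.needs_rotation = True  # pretend the events are over-committed

    def close_all(self):
        self.flexible.clear()
        if self.clear_flag_when_empty:
            self.needs_rotation = False

    def timer_tick(self):
        """Return True if the timer should be re-armed."""
        if not self.needs_rotation:
            return False
        self.pmu.disable()
        if self.flexible:
            self.flexible.append(self.flexible.pop(0))  # rotate the list
        self.pmu.enable()
        return True


def run(clear_flag_when_empty, max_ticks=5):
    pmu = ToyPMU()
    ctx = ToyContext(pmu, clear_flag_when_empty)
    ctx.add_event("a")
    ctx.add_event("b")
    ctx.timer_tick()    # normal rotation while events exist
    ctx.close_all()     # every event is closed
    pmu.loaded = False  # the module is then legitimately unloaded
    for _ in range(max_ticks):
        try:
            if not ctx.timer_tick():
                return "timer stopped"
        except RuntimeError as exc:
            return f"crash: {exc}"
    return "still running"


if __name__ == "__main__":
    print("flag cleared:", run(True))
    print("flag stale:  ", run(False))
```

In this model the timer stops on its own when the flag is cleared along with
the last event, whereas with a stale flag the first tick after the unload
hits a callback that is no longer there.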