Re: [PATCH 0/9] perf: Avoid explicit cpumask var allocation from stack
From: Mark Rutland
Date: Tue Apr 02 2024 - 07:13:04 EST
On Tue, Apr 02, 2024 at 06:56:01PM +0800, Dawei Li wrote:
> Hi,
>
> This series try to eliminate direct cpumask var allocation from stack
> for perf subsystem.
>
> Direct/explicit allocation of cpumask on stack could be dangerous since
> it can lead to stack overflow for systems with big NR_CPUS or
> CONFIG_CPUMASK_OFFSTACK=y.
>
> For arm64, it's more urgent since commit 3fbd56f0e7c1 ("ARM64: Dynamically
> allocate cpumasks and increase supported CPUs to 512").
>
> It's sort of a pattern that almost every cpumask var in perf subystem
> occurs in teardown callback of cpuhp. In which case, if dynamic
> allocation failed(which is unlikely), we choose return 0 rather than
> -ENOMEM to caller cuz:
> @teardown is not supposed to fail and if it does, system crashes:
.. but we've left the system in an incorrect state, so that makes no sense.
As I commented on the first patch, NAK to dynamically allocating cpumasks in
the CPUHP callbacks. Please allocate the necessry cpumasks up-front when we
probe the PMU. At that time we can handle an allocation failure by cleaning up
and failing to probe the PMU, and then the CPUHP callbacks don't need to
allocate memory to offline a CPU...
Also, for the titles it'd be better to say something like "avoid placing
cpumasks on the stack", because "explicit cpumask var allocation" sounds like
the use of alloc_cpumask_var().
Mark.
>
> static int cpuhp_issue_call(int cpu, enum cpuhp_state state, bool bringup,
> struct hlist_node *node)
> {
> struct cpuhp_step *sp = cpuhp_get_step(state);
> int ret;
>
> /*
> * If there's nothing to do, we done.
> * Relies on the union for multi_instance.
> */
> if (cpuhp_step_empty(bringup, sp))
> return 0;
> /*
> * The non AP bound callbacks can fail on bringup. On teardown
> * e.g. module removal we crash for now.
> */
> #ifdef CONFIG_SMP
> if (cpuhp_is_ap_state(state))
> ret = cpuhp_invoke_ap_callback(cpu, state, bringup, node);
> else
> ret = cpuhp_invoke_callback(cpu, state, bringup, node,
> NULL);
> #else
> ret = cpuhp_invoke_callback(cpu, state, bringup, node, NULL);
> #endif
> BUG_ON(ret && !bringup);
> return ret;
> }
>
> Dawei Li (9):
> perf/alibaba_uncore_drw: Avoid explicit cpumask var allocation from
> stack
> perf/arm-cmn: Avoid explicit cpumask var allocation from stack
> perf/arm_cspmu: Avoid explicit cpumask var allocation from stack
> perf/arm_dsu: Avoid explicit cpumask var allocation from stack
> perf/dwc_pcie: Avoid explicit cpumask var allocation from stack
> perf/hisi_pcie: Avoid explicit cpumask var allocation from stack
> perf/hisi_uncore: Avoid explicit cpumask var allocation from stack
> perf/qcom_l2: Avoid explicit cpumask var allocation from stack
> perf/thunder_x2: Avoid explicit cpumask var allocation from stack
>
> drivers/perf/alibaba_uncore_drw_pmu.c | 13 +++++++++----
> drivers/perf/arm-cmn.c | 13 +++++++++----
> drivers/perf/arm_cspmu/arm_cspmu.c | 13 +++++++++----
> drivers/perf/arm_dsu_pmu.c | 18 +++++++++++++-----
> drivers/perf/dwc_pcie_pmu.c | 17 +++++++++++------
> drivers/perf/hisilicon/hisi_pcie_pmu.c | 15 ++++++++++-----
> drivers/perf/hisilicon/hisi_uncore_pmu.c | 13 +++++++++----
> drivers/perf/qcom_l2_pmu.c | 15 ++++++++++-----
> drivers/perf/thunderx2_pmu.c | 20 ++++++++++++--------
> 9 files changed, 92 insertions(+), 45 deletions(-)
>
>
> Thanks,
>
> Dawei
>
> --
> 2.27.0
>