[PATCH v2 00/23] interconnect: fix racy provider registration

From: Johan Hovold
Date: Mon Mar 06 2023 - 02:57:51 EST


The current interconnect provider interface is inherently racy as
providers are expected to be registered before being fully initialised.

Specifically, this can cause racing DT lookups to fail, as I recently
noticed when the Qualcomm cpufreq driver failed to probe:

    of_icc_xlate_onecell: invalid index 0
    cpu cpu0: error -EINVAL: error finding src node
    cpu cpu0: dev_pm_opp_of_find_icc_paths: Unable to get path0: -22
    qcom-cpufreq-hw: probe of 18591000.cpufreq failed with error -22

This happens only very rarely, but the bug is easily reproduced by
widening the race window with an msleep() added after registering the
osm-l3 interconnect provider.

Note that the Qualcomm cpufreq driver is especially susceptible to this
race as the interconnect path is looked up from the CPU nodes, which
means that driver core does not guarantee the probe order even when
device links are enabled (which they are not always).

This series adds a new interconnect provider registration API which is
used to fix up the interconnect drivers before removing the old racy
API.
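
To illustrate the ordering problem, here is a minimal userspace sketch,
not the kernel code. All toy_* names, the probe_* helpers and
TOY_EPROBE_DEFER are hypothetical, and it assumes that the fix amounts to
making a provider visible to lookups only after its nodes exist. The
racing consumer lookup is simulated deterministically at the point where
it would hit the window:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_MAX_NODES		4
#define TOY_EPROBE_DEFER	517	/* stand-in for the kernel's EPROBE_DEFER */

/* Hypothetical stand-in for an interconnect provider. */
struct toy_provider {
	int num_nodes;			/* nodes created so far */
	int nodes[TOY_MAX_NODES];	/* node ids, valid up to num_nodes */
	int registered;			/* non-zero once visible to lookups */
};

static void toy_node_add(struct toy_provider *p, int id)
{
	assert(p->num_nodes < TOY_MAX_NODES);
	p->nodes[p->num_nodes++] = id;
}

static void toy_register(struct toy_provider *p)
{
	p->registered = 1;
}

/* Consumer-side lookup of node 'index' in provider 'p'. */
static int toy_xlate(const struct toy_provider *p, int index)
{
	if (!p->registered)
		return -TOY_EPROBE_DEFER;	/* consumer can retry later */
	if (index < 0 || index >= p->num_nodes)
		return -EINVAL;			/* "invalid index" */
	return p->nodes[index];
}

/* Racy order: register first, create nodes afterwards. */
static int probe_register_first(struct toy_provider *p)
{
	int seen;

	toy_register(p);
	seen = toy_xlate(p, 0);		/* simulated racing lookup */
	toy_node_add(p, 42);

	return seen;
}

/* Race-free order: create nodes first, register last. */
static int probe_register_last(struct toy_provider *p)
{
	int seen;

	seen = toy_xlate(p, 0);		/* simulated racing lookup */
	toy_node_add(p, 42);
	toy_register(p);

	return seen;
}
```

With registration first, the simulated lookup fails hard with -EINVAL,
which a consumer has no reason to retry. With registration last, the
same lookup sees a deferral and succeeds once the provider is complete.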

Also included are a number of fixes for other bugs found while preparing
the series.

Johan


Changes in v2
 - icc_node_destroy() can be called with an arbitrary node id, so add the
   missing sanity check to handle potential attempts to destroy a
   non-existent node (patch 01/23). Reported by Dan Carpenter and the
   kernel test robot:

https://lore.kernel.org/oe-kbuild/202302222118.nGz1F0oJ-lkp@xxxxxxxxx/
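
The kind of sanity check meant here can be sketched in userspace as
follows. This is an illustration under assumptions, not the code from
patch 01/23: the toy_* names and the fixed-size id table are
hypothetical stand-ins for the real node lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_MAX_ID	8

struct toy_node {
	int id;
};

/* Hypothetical id-to-node table; NULL means no node with that id. */
static struct toy_node *toy_nodes[TOY_MAX_ID];

static struct toy_node *toy_node_find(int id)
{
	if (id < 0 || id >= TOY_MAX_ID)
		return NULL;

	return toy_nodes[id];
}

/* Returns the new node, or NULL on a bad or duplicate id or on OOM. */
static struct toy_node *toy_node_create(int id)
{
	struct toy_node *node;

	if (id < 0 || id >= TOY_MAX_ID || toy_nodes[id])
		return NULL;

	node = malloc(sizeof(*node));
	if (!node)
		return NULL;

	node->id = id;
	toy_nodes[id] = node;

	return node;
}

/* Returns 1 if a node was destroyed, 0 if there was none with this id. */
static int toy_node_destroy(int id)
{
	struct toy_node *node = toy_node_find(id);

	/* The sanity check: callers may pass an id that was never created. */
	if (!node)
		return 0;

	assert(node->id == id);
	toy_nodes[id] = NULL;
	free(node);

	return 1;
}
```

Without the NULL check, destroying an id that was never created would
dereference a NULL pointer.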


Johan Hovold (23):
interconnect: fix mem leak when freeing nodes
interconnect: fix icc_provider_del() error handling
interconnect: fix provider registration API
interconnect: imx: fix registration race
interconnect: qcom: osm-l3: fix registration race
interconnect: qcom: rpm: fix probe child-node error handling
interconnect: qcom: rpm: fix probe PM domain error handling
interconnect: qcom: rpm: fix registration race
interconnect: qcom: rpmh: fix probe child-node error handling
interconnect: qcom: rpmh: fix registration race
interconnect: qcom: msm8974: fix registration race
interconnect: qcom: sm8450: fix registration race
interconnect: qcom: sm8550: fix registration race
interconnect: exynos: fix node leak in probe PM QoS error path
interconnect: exynos: fix registration race
interconnect: exynos: drop redundant link destroy
memory: tegra: fix interconnect registration race
memory: tegra124-emc: fix interconnect registration race
memory: tegra20-emc: fix interconnect registration race
memory: tegra30-emc: fix interconnect registration race
interconnect: drop racy registration API
interconnect: drop unused icc_get() interface
interconnect: drop unused icc_link_destroy() interface

drivers/interconnect/core.c | 152 +++++---------------------
drivers/interconnect/imx/imx.c | 20 ++--
drivers/interconnect/qcom/icc-rpm.c | 33 +++---
drivers/interconnect/qcom/icc-rpmh.c | 30 +++--
drivers/interconnect/qcom/msm8974.c | 20 ++--
drivers/interconnect/qcom/osm-l3.c | 14 +--
drivers/interconnect/qcom/sm8450.c | 22 ++--
drivers/interconnect/qcom/sm8550.c | 22 ++--
drivers/interconnect/samsung/exynos.c | 30 ++---
drivers/memory/tegra/mc.c | 16 ++-
drivers/memory/tegra/tegra124-emc.c | 12 +-
drivers/memory/tegra/tegra20-emc.c | 12 +-
drivers/memory/tegra/tegra30-emc.c | 12 +-
include/linux/interconnect-provider.h | 19 ++--
include/linux/interconnect.h | 8 --
15 files changed, 157 insertions(+), 265 deletions(-)

--
2.39.2