Re: [PATCH v7] crypto: qce - Add runtime PM and interconnect bandwidth scaling support

From: Bjorn Andersson

Date: Fri Feb 20 2026 - 18:40:01 EST


On Fri, Feb 20, 2026 at 12:58:18PM +0530, quic_utiwari@xxxxxxxxxxx wrote:
> From: Udit Tiwari <quic_utiwari@xxxxxxxxxxx>
>
> The Qualcomm Crypto Engine (QCE) driver currently lacks support for
> runtime power management (PM) and interconnect bandwidth control.
> As a result, the hardware remains fully powered and clocks stay
> enabled even when the device is idle. Additionally, static
> interconnect bandwidth votes are held indefinitely, preventing the
> system from reclaiming unused bandwidth.
>
> Address this by enabling runtime PM and dynamic interconnect
> bandwidth scaling to allow the system to suspend the device when idle
> and scale interconnect usage based on actual demand. Improve overall
> system efficiency by reducing power usage and optimizing interconnect
> resource allocation.
>
> Make the following changes as part of this integration:

"this integration" is internal lingo. In fact I think you can omit this
whole paragraph, because the bullets here are expected.

>
> - Add support for pm_runtime APIs to manage device power state
> transitions.
> - Implement runtime_suspend() and runtime_resume() callbacks to gate
> clocks and vote for interconnect bandwidth only when needed.
> - Replace devm_clk_get_optional_enabled() with devm_pm_clk_create() +
> pm_clk_add() and let the PM core manage device clocks during runtime
> PM and system sleep.
> - Register dev_pm_ops with the platform driver to hook into the PM
> framework.
>
> Tested:

This isn't very useful to carry in the git history, please move this
down below '---'.

>
> - Verify that ICC votes drop to zero after probe and upon request
> completion.
> - Confirm that runtime PM usage count increments during active
> requests and decrements afterward.
> - Observe that the device correctly enters the suspended state when
> idle.
>
> Signed-off-by: Udit Tiwari <quic_utiwari@xxxxxxxxxxx>

Please switch to oss.qualcomm.com

> ---
> Changes in v7:
> - Use ACQUIRE guard in probe to simplify runtime PM management and error paths.
> - Drop redundant icc_enable() call in runtime resume path.
> - Explicitly call pm_clk_suspend(dev) and pm_clk_resume(dev) within the
> custom runtime PM callbacks. Since custom callbacks are provided to handle
> interconnect scaling, the standard PM clock helpers must be invoked manually
> to ensure clocks are gated/ungated.
>
> Changes in v6:
> - Adopt ACQUIRE(pm_runtime_active_try, ...) for scoped runtime PM management
> in qce_handle_queue(). This removes the need for manual put calls and
> goto labels in the error paths, as suggested by Konrad.
> - Link to v6: https://lore.kernel.org/lkml/20260210061437.2293654-1-quic_utiwari@xxxxxxxxxxx/
>
> Changes in v5:
> - Drop Reported-by and Closes tags for kernel test robot W=1 warnings, as
> the issue was fixed within the same patch series.
> - Fix a minor comment indentation/style issue.
> - Link to v5: https://lore.kernel.org/lkml/20251120062443.2016084-1-quic_utiwari@xxxxxxxxxxx/
>
> Changes in v4:
> - Annotate runtime PM callbacks with __maybe_unused to silence W=1 warnings.
> - Add Reported-by and Closes tags for kernel test robot warning.
> - Link to v4: https://lore.kernel.org/lkml/20251117062737.3946074-1-quic_utiwari@xxxxxxxxxxx/
>
> Changes in v3:
> - Switch from manual clock management to PM clock helpers
> (devm_pm_clk_create() + pm_clk_add()); no direct clk_* enable/disable
> in runtime callbacks.
> - Replace pm_runtime_get_sync() with pm_runtime_resume_and_get(); remove
> pm_runtime_put_noidle() on error.
> - Define PM ops using helper macros and reuse runtime callbacks for system
> sleep via pm_runtime_force_suspend()/pm_runtime_force_resume().
> - Link to v2: https://lore.kernel.org/lkml/20250826110917.3383061-1-quic_utiwari@xxxxxxxxxxx/
>
> Changes in v2:
> - Extend suspend/resume support to include runtime PM and ICC scaling.
> - Register dev_pm_ops and implement runtime_suspend/resume callbacks.
> - Link to v1: https://lore.kernel.org/lkml/20250606105808.2119280-1-quic_utiwari@xxxxxxxxxxx/
> ---
> drivers/crypto/qce/core.c | 87 +++++++++++++++++++++++++++++++++------
> 1 file changed, 75 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/crypto/qce/core.c b/drivers/crypto/qce/core.c
> index b966f3365b7d..776a08340b08 100644
> --- a/drivers/crypto/qce/core.c
> +++ b/drivers/crypto/qce/core.c
> @@ -12,6 +12,9 @@
> #include <linux/module.h>
> #include <linux/mod_devicetable.h>
> #include <linux/platform_device.h>
> +#include <linux/pm.h>
> +#include <linux/pm_runtime.h>
> +#include <linux/pm_clock.h>
> #include <linux/types.h>
> #include <crypto/algapi.h>
> #include <crypto/internal/hash.h>
> @@ -90,6 +93,11 @@ static int qce_handle_queue(struct qce_device *qce,
> struct crypto_async_request *async_req, *backlog;
> int ret = 0, err;

First use of ret is now an assignment, so please drop the unnecessary
zero-initialization.

>
> + ACQUIRE(pm_runtime_active_try, pm)(qce->dev);
> + ret = ACQUIRE_ERR(pm_runtime_active_auto_try, &pm);

Luckily, this - to me - incomprehensible construct got some useful
wrappers in ef8057b07c72 ("PM: runtime: Wrapper macros for
ACQUIRE()/ACQUIRE_ERR()"), merged back in November. So, you should
instead use the form:

PM_RUNTIME_ACQUIRE(qce->dev, pm);
if ((ret = PM_RUNTIME_ACQUIRE_ERR(&pm)))
return ret;

or (I don't think we really care about the specific error returned):

PM_RUNTIME_ACQUIRE(qce->dev, pm);
if (PM_RUNTIME_ACQUIRE_ERR(&pm))
return -Esome_specific_error;


Although I presume that's PM_RUNTIME_ACQUIRE_AUTOSUSPEND() in your case.

> + if (ret)
> + return ret;
> +
> scoped_guard(mutex, &qce->lock) {
> if (req)
> ret = crypto_enqueue_request(&qce->queue, req);
> @@ -207,23 +215,34 @@ static int qce_crypto_probe(struct platform_device *pdev)
> if (ret < 0)
> return ret;
>
> - qce->core = devm_clk_get_optional_enabled(qce->dev, "core");
> - if (IS_ERR(qce->core))
> - return PTR_ERR(qce->core);
> + /* PM clock helpers: register device clocks */
> + ret = devm_pm_clk_create(dev);
> + if (ret)
> + return ret;
>
> - qce->iface = devm_clk_get_optional_enabled(qce->dev, "iface");
> - if (IS_ERR(qce->iface))
> - return PTR_ERR(qce->iface);
> + ret = pm_clk_add(dev, "core");
> + if (ret)
> + return ret;
>
> - qce->bus = devm_clk_get_optional_enabled(qce->dev, "bus");
> - if (IS_ERR(qce->bus))
> - return PTR_ERR(qce->bus);
> + ret = pm_clk_add(dev, "iface");
> + if (ret)
> + return ret;
>
> - qce->mem_path = devm_of_icc_get(qce->dev, "memory");
> + ret = pm_clk_add(dev, "bus");
> + if (ret)
> + return ret;
> +
> + qce->mem_path = devm_of_icc_get(dev, "memory");
> if (IS_ERR(qce->mem_path))
> return PTR_ERR(qce->mem_path);
>
> - ret = icc_set_bw(qce->mem_path, QCE_DEFAULT_MEM_BANDWIDTH, QCE_DEFAULT_MEM_BANDWIDTH);
> + /* Enable runtime PM after clocks and ICC are acquired */

It wouldn't hurt to continue this sentence to include that you're doing
it like that so that clocks and interconnect votes are applied by the
resume callback (I assume that's your reason for the ordering at least).

> + ret = devm_pm_runtime_enable(dev);
> + if (ret)
> + return ret;
> +
> + ACQUIRE(pm_runtime_active_try, pm)(dev);
> + ret = ACQUIRE_ERR(pm_runtime_active_auto_try, &pm);

As above.

> if (ret)
> return ret;
>
> @@ -245,9 +264,52 @@ static int qce_crypto_probe(struct platform_device *pdev)
> qce->async_req_enqueue = qce_async_request_enqueue;
> qce->async_req_done = qce_async_request_done;
>
> - return devm_qce_register_algs(qce);
> + ret = devm_qce_register_algs(qce);
> + if (ret)
> + return ret;
> +
> + /* Configure autosuspend after successful init */
> + pm_runtime_set_autosuspend_delay(dev, 100);
> + pm_runtime_use_autosuspend(dev);
> + pm_runtime_mark_last_busy(dev);
> +
> + return 0;
> }
>
> +static int __maybe_unused qce_runtime_suspend(struct device *dev)
> +{
> + struct qce_device *qce = dev_get_drvdata(dev);
> +
> + icc_disable(qce->mem_path);
> +
> + return pm_clk_suspend(dev);
> +}
> +
> +static int __maybe_unused qce_runtime_resume(struct device *dev)
> +{
> + struct qce_device *qce = dev_get_drvdata(dev);
> + int ret = 0;
> +
> + ret = pm_clk_resume(dev);

What is the reason to use pm_clk_add() if you need to manually
enable/disable them anyways?

> + if (ret)
> + return ret;
> +
> + ret = icc_set_bw(qce->mem_path, QCE_DEFAULT_MEM_BANDWIDTH, QCE_DEFAULT_MEM_BANDWIDTH);

icc_disable() will set the "enabled" flag on the internal interconnect
data structures, which causes the vote to be skipped in aggregation.

So, not only does it look unbalanced with icc_disable() vs icc_set_bw()
in suspend/resume, I don't think you have a bandwidth vote after the
first suspend, unless you call icc_enable() here.

> + if (ret)

Just put pm_clk_suspend(dev) here and return ret; below. Skip the goto.

Regards,
Bjorn

> + goto err_icc;
> +
> + return 0;
> +
> +err_icc:
> + pm_clk_suspend(dev);
> + return ret;
> +}
> +
> +static const struct dev_pm_ops qce_crypto_pm_ops = {
> + SET_RUNTIME_PM_OPS(qce_runtime_suspend, qce_runtime_resume, NULL)
> + SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend, pm_runtime_force_resume)
> +};
> +
> static const struct of_device_id qce_crypto_of_match[] = {
> { .compatible = "qcom,crypto-v5.1", },
> { .compatible = "qcom,crypto-v5.4", },
> @@ -261,6 +323,7 @@ static struct platform_driver qce_crypto_driver = {
> .driver = {
> .name = KBUILD_MODNAME,
> .of_match_table = qce_crypto_of_match,
> + .pm = &qce_crypto_pm_ops,
> },
> };
> module_platform_driver(qce_crypto_driver);
> --
> 2.34.1
>
>