Re: [PATCH v2 08/17] PM: EM: Introduce runtime modifiable table

From: Dietmar Eggemann
Date: Tue May 30 2023 - 06:18:30 EST


On 12/05/2023 11:57, Lukasz Luba wrote:
> This patch introduces the new feature: modifiable EM perf_state table.
> The new runtime table would be populated with a new power data to better
> reflect the actual power. The power can vary over time e.g. due to the
> SoC temperature change. Higher temperature can increase power values.
> For longer running scenarios, such as game or camera, when also other
> devices are used (e.g. GPU, ISP) the CPU power can change. The new
> EM framework is able to addresses this issue and change the data
> at runtime safely. The runtime modifiable EM data is used by the Energy
> Aware Scheduler (EAS) for the task placement.

It's important to say that EAS is the _only_user of the `runtime
modifiable EM`. All the other users (thermal, etc.) are still using the
default (basic) EM. IMHO, this fact drove the design here.

> Signed-off-by: Lukasz Luba <lukasz.luba@xxxxxxx>
> ---
> include/linux/energy_model.h | 13 +++++++++++++
> kernel/power/energy_model.c | 24 ++++++++++++++++++++++++
> 2 files changed, 37 insertions(+)
>
> diff --git a/include/linux/energy_model.h b/include/linux/energy_model.h
> index cc2bf607191e..a616006a8130 100644
> --- a/include/linux/energy_model.h
> +++ b/include/linux/energy_model.h
> @@ -36,9 +36,21 @@ struct em_perf_state {
> */
> #define EM_PERF_STATE_INEFFICIENT BIT(0)
>
> +/**
> + * struct em_perf_table - Performance states table, which can be
> + * runtime modifiable and protected with RCU

which is `runtime modifiable` ? So `runtime modifiable performance state
table`? RCU is obvious since we have `struct rcu_head rcu`.

> + * @state: List of performance states, in ascending order
> + * @rcu: RCU used for safe access and destruction
> + */
> +struct em_perf_table {
> + struct em_perf_state *state;
> + struct rcu_head rcu;
> +};
> +
> /**
> * struct em_perf_domain - Performance domain
> * @table: List of performance states, in ascending order
> + * @runtime_table: Pointer to the runtime modified em_perf_table

s/modified/modifiable

[...]

> @@ -237,12 +238,23 @@ static int em_create_pd(struct device *dev, int nr_states,
> return -ENOMEM;
> }
>
> + runtime_table = kzalloc(sizeof(*runtime_table), GFP_KERNEL);
> + if (!runtime_table) {
> + kfree(pd);
> + return -ENOMEM;
> + }
> +
> ret = em_create_perf_table(dev, pd, nr_states, cb, flags);
> if (ret) {
> kfree(pd);
> + kfree(runtime_table);
> return ret;
> }
>
> + /* Re-use temporally (till 1st modification) the memory */

So this means that the runtime (modifiable) table
(pd->runtime_table>state) is mapped to the default (basic) table
(pd->default_table->state) until the first call to
em_dev_update_perf_domain() (here mentioned as the 1st modification)?

IMHO, not easy to understand since neither the cover letter, nor
documentation patch 15/17 describes this in a consistent story.

[...]