Re: [PATCH] PM: EM: Fix potential division-by-zero error in em_compute_costs()

From: Yaxiong Tian
Date: Thu Apr 10 2025 - 21:21:45 EST




在 2025/4/10 21:23, Rafael J. Wysocki 写道:
On Thu, Apr 10, 2025 at 7:39 AM Yaxiong Tian <iambestgod@xxxxxx> wrote:

From: Yaxiong Tian <tianyaxiong@xxxxxxxxxx>

When the device is of a non-CPU type, table[i].performance won't be
initialized in the previous em_init_performance(), resulting in division
by zero when calculating costs in em_compute_costs().

Considering that the performance field in struct em_perf_state is defined
as "CPU performance (capacity) at a given frequency", the original
calculation method should be maintained when the device is of a non-CPU
type.

Fixes: <1b600da51073> ("PM: EM: Optimize em_cpu_energy() and remove division")

Signed-off-by: Yaxiong Tian <tianyaxiong@xxxxxxxxxx>
---
kernel/power/energy_model.c | 14 +++++++++++---
1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/kernel/power/energy_model.c b/kernel/power/energy_model.c
index d9b7e2b38c7a..bbd95573d91e 100644
--- a/kernel/power/energy_model.c
+++ b/kernel/power/energy_model.c
@@ -231,9 +231,11 @@ static int em_compute_costs(struct device *dev, struct em_perf_state *table,
unsigned long flags)
{
unsigned long prev_cost = ULONG_MAX;
+ u64 fmax;

Why not initialize it here? Also please retain the reverse x-mas tree
ordering of declarations.

There is indeed an issue with imperfect code style here.

int i, ret;

/* Compute the cost of each performance state. */
+ fmax = (u64) table[nr_states - 1].frequency;

No need to cast to u64 explicitly (it will be cast anyway).

for (i = nr_states - 1; i >= 0; i--) {
unsigned long power_res, cost;

@@ -245,9 +247,15 @@ static int em_compute_costs(struct device *dev, struct em_perf_state *table,
return -EINVAL;
}
} else {
- /* increase resolution of 'cost' precision */
- power_res = table[i].power * 10;
- cost = power_res / table[i].performance;
+ if ( ) {
+ /* increase resolution of 'cost' precision */
+ power_res = table[i].power * 10;
+ cost = power_res / table[i].performance;
+ } else {
+ power_res = table[i].power;
+ cost = div64_u64(fmax * power_res, table[i].frequency);

Why is it necessary to compute the "cost" field value for non-CPU
devices at all?

Indeed, I didn't think of this issue—I was focused on ensuring the rest
of the code was correct and forgot that the 'cost' algorithm was only
for EAS energy efficiency calculations. I checked other parts of the
code, and 'cost' isn't used elsewhere. Therefore, this bug can be fixed
simply by adding a _is_cpu_device(dev) check. I'll make the change in
the next version and update the commit message accordingly.

+

An excess empty line.

+ }
}

table[i].cost = cost;
--