Re: [PATCH v2] PM: EM: Fix potential division-by-zero error in em_compute_costs()
From: Rafael J. Wysocki
Date: Tue Apr 15 2025 - 13:20:06 EST
On Tue, Apr 15, 2025 at 4:03 AM Yaxiong Tian <iambestgod@xxxxxx> wrote:
>
>
>
> On 2025/4/15 09:12, Yaxiong Tian wrote:
> >
> >
> > On 2025/4/14 16:08, Lukasz Luba wrote:
> >> Hi Yaxiong,
> >>
> >> On 4/11/25 02:28, Yaxiong Tian wrote:
> >>> From: Yaxiong Tian <tianyaxiong@xxxxxxxxxx>
> >>>
> >>> When the device is of a non-CPU type, table[i].performance won't be
> >>> initialized in the previous em_init_performance(), resulting in division
> >>> by zero when calculating costs in em_compute_costs().
> >>>
> >>> Since the 'cost' algorithm is only used for EAS energy efficiency
> >>> calculations and is currently not utilized by other device drivers, we
> >>> should add the _is_cpu_device(dev) check to prevent this
> >>> division-by-zero issue.
> >>>
> >>> Fixes: <1b600da51073> ("PM: EM: Optimize em_cpu_energy() and remove
> >>> division")
> >>> Signed-off-by: Yaxiong Tian <tianyaxiong@xxxxxxxxxx>
> >>> ---
> >>> kernel/power/energy_model.c | 2 +-
> >>> 1 file changed, 1 insertion(+), 1 deletion(-)
> >>>
> >>> diff --git a/kernel/power/energy_model.c b/kernel/power/energy_model.c
> >>> index d9b7e2b38c7a..d1fa7e8787b5 100644
> >>> --- a/kernel/power/energy_model.c
> >>> +++ b/kernel/power/energy_model.c
> >>> @@ -244,7 +244,7 @@ static int em_compute_costs(struct device *dev,
> >>> struct em_perf_state *table,
> >>> cost, ret);
> >>> return -EINVAL;
> >>> }
> >>> - } else {
> >>> + } else if (_is_cpu_device(dev)) {
> >>> /* increase resolution of 'cost' precision */
> >>> power_res = table[i].power * 10;
> >>> cost = power_res / table[i].performance;
> >>
> >>
> >> As the test robot pointed out, please set the 'cost' to 0
> >> where it's declared.
> >>
> >> The rest should be fine.
> >>
> >> Regards,
> >> Lukasz
> >
> > Sorry, the V3 version with cost=0 still has issues.
> >
> > I noticed that if the cost is set to 0, the condition
> > "if (table[i].cost >= prev_cost)" in the following code will always
> > evaluate to true. This will incorrectly set the flags to
> > EM_PERF_STATE_INEFFICIENT.
> >
> > Should we change ">=" to ">"?
> >
>
> Sorry again, setting EM_PERF_STATE_INEFFICIENT in this case is correct.
> Earlier, I misunderstood the definition/usage of EM_PERF_STATE_INEFFICIENT.
Well, EM_PERF_STATE_INEFFICIENT is only looked at in CPU energy
models, so setting it in a non-CPU one is redundant.
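
For illustration only, here is a minimal user-space sketch of the logic
discussed above. It is not the kernel code: struct sketch_perf_state,
SKETCH_STATE_INEFFICIENT and the is_cpu parameter are hypothetical stand-ins
for struct em_perf_state, EM_PERF_STATE_INEFFICIENT and _is_cpu_device(dev),
and the artificial-EM get_cost() branch is left out. It assumes 'cost' is
initialized to 0 and the division is only done for CPU devices.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in for EM_PERF_STATE_INEFFICIENT. */
#define SKETCH_STATE_INEFFICIENT 0x1UL

/* Hypothetical, reduced stand-in for struct em_perf_state. */
struct sketch_perf_state {
	unsigned long performance;	/* stays 0 for non-CPU devices */
	unsigned long power;
	unsigned long cost;
	unsigned long flags;
};

/* is_cpu stands in for the result of _is_cpu_device(dev). */
static void sketch_compute_costs(struct sketch_perf_state *table,
				 int nr_states, bool is_cpu)
{
	unsigned long prev_cost = ULONG_MAX;
	int i;

	/* Walk from the highest performance state down to the lowest. */
	for (i = nr_states - 1; i >= 0; i--) {
		unsigned long cost = 0;	/* defined even if no branch sets it */

		if (is_cpu) {
			/* CPU tables are expected to have performance set. */
			assert(table[i].performance != 0);
			/* increase resolution of 'cost' precision */
			cost = table[i].power * 10 / table[i].performance;
		}

		table[i].cost = cost;

		/* With cost == 0 everywhere, this is true after the first pass. */
		if (table[i].cost >= prev_cost)
			table[i].flags = SKETCH_STATE_INEFFICIENT;
		else
			prev_cost = table[i].cost;
	}
}
```

With is_cpu == false, no division takes place, every cost is 0, and every
state except the highest one ends up flagged as inefficient. That is harmless
if the flag is only consulted for CPU energy models, but it is also why
setting it for non-CPU devices is redundant.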