Re: [PATCH 2/3] thermal: core: fix: Initialize the max_state variable to 0

From: Eduardo Valentin
Date: Mon Oct 06 2014 - 14:01:33 EST


On Fri, Oct 03, 2014 at 10:26:28AM +0200, Lukasz Majewski wrote:
> Hi Eduardo,
>
> > Hello Lukasz,
> >
> > On Wed, Sep 24, 2014 at 10:27:11AM +0200, Lukasz Majewski wrote:
> > > Pointer to the uninitialized max_state variable is passed to get the
> > > maximal cooling state.
> > > For CPU cooling device (cpu_cooling.c) the cpufreq_get_max_state()
> > > is called, which even when error occurs will return the max_state
> > > variable unchanged.
> > > Since error for ->get_max_state() is not checked, the automatically
> >
> >
> > Good that you added a fix in your series for this.
> >
> >
> > > allocated value of max_state is used for (upper > max_state)
> > > comparison. For any possible max_state value it is very unlikely
> > > that it will be less than upper.
> > > As a consequence, the cooling device is bind even without the backed
> > > cpufreq table initialized.
> > >
> > > This initialization will prevent from accidental binding trip
> > > points to cpu freq cooling frequencies when cpufreq driver itself
> > > is not yet fully initialized.
> >
> > Although I agree with the fix, as long as we also include a check for
> > the .get_max_state return value, I believe the problem you are
> > describing is about initialization sequence.
>
> As you pointed out - the problem here is with initialization sequence.
> Thermal and cpufreq cores are initialized very early.
>
> However, the get_max_state() for cpu_cooling.c device (as it is the
> pervasive way of cooling things) accesses cpufreq policy to get the
> freq_table and count available states.
>
> The issue here is with late initialization of cpufreq policy.
>
> Up till now the cpu_cooling device was bind even when the
> get_max_state() returned -EINVAL and everything worked after late
> cpufreq policy initialization. However, during this time window the
> thermal driver is not configured.
>
> >
> > In general, I believe we need a better
> > sequencing between thermal and cpufreq subsystems. One way out is to
> > include a check for cpufreq driver in the thermal driver, and return
> > -EPROBE_DEFER when cpufreq is not ready.
>
> I think that we could return -EPROBE_DEFER when cpufreq's policy is not
> yet available and subscribe to cpufreq notifier to call
> bind_cooling_device then.

I agree with this approach, as long as we keep the existing users of
cpufreq cooling in a working state.

Cheers

>
> Let's wait for Zhang opinion since he looks after the thermal core code.
>
> >
> > >
> > > Signed-off-by: Lukasz Majewski <l.majewski@xxxxxxxxxxx>
> > > ---
> > > drivers/thermal/thermal_core.c | 2 +-
> > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/thermal/thermal_core.c
> > > b/drivers/thermal/thermal_core.c index 454884a..747618a 100644
> > > --- a/drivers/thermal/thermal_core.c
> > > +++ b/drivers/thermal/thermal_core.c
> > > @@ -927,7 +927,7 @@ int thermal_zone_bind_cooling_device(struct
> > > thermal_zone_device *tz, struct thermal_instance *pos;
> > > struct thermal_zone_device *pos1;
> > > struct thermal_cooling_device *pos2;
> > > - unsigned long max_state;
> > > + unsigned long max_state = 0;
> > > int result;
> > >
> > > if (trip >= tz->trips || (trip < 0 && trip !=
> > > THERMAL_TRIPS_NONE)) --
> > > 2.0.0.rc2
> > >
> >
>
>
>
> --
> Best regards,
>
> Lukasz Majewski
>
> Samsung R&D Institute Poland (SRPOL) | Linux Platform Group
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/