Re: [PATCH 02/31] OPP: Add dev_pm_opp_set_config() and friends
From: Viresh Kumar
Date: Tue Jun 21 2022 - 11:34:47 EST
Hi Dmitry,
On 21-06-22, 18:09, Dmitry Osipenko wrote:
> 1. I started to look at the Tegra regressions caused by these OPP
> patches and this one looks wrong to me because dev_pm_opp_set_config()
> could be invoked multiple times by different drivers for the same device
> and then you're putting table not in accordance to the config that was
> used by a particular driver.
>
> For example, if parent tegra-cpufreq driver sets supported_hw for
> cpu_dev and then cpufreq-dt also does dev_pm_opp_set_config(cpu_dev),
> then dev_pm_opp_clear_config(cpu_dev) of cpufreq-dt will put
> supported_hw(cpu_dev) of tegra-cpufreq. Hence this
> dev_pm_opp_set/clear_config() approach isn't viable, unless I'm missing
> something.
Yeah, I know that and I didn't put a lot of effort into it because of multiple
reasons:
- That is partially the existing behavior. For example, we can call the
set-supported-hw interface right now for each CPU of a policy, which many
drivers do right now btw, and then while putting them back we drop the
resource on the first call itself and not on the last CPU.
- Yes, with the new patchset we will drop the resources even for an unrelated
resource call, I will think again about it though and maybe add a flag field
to notify which all resources to clean, but even in the current case it should
be fine as we won't be able to use a half initialized OPP table anyway (which
may actually be harmful). What I mean is, if you set regulators and
supported-hw in the beginning, then on un-init, we won't want to work with
only one of them in place. We always want all of them.
> 2. Patches aren't bisectable, please make sure that all patches compile
> individually and without warnings.
That is strange. I will try build over each and every patch (again). Also I
think the kernel bots (from LKP) test individual patches and I haven't got any
failure messages yet. Which patch broke the build for you ?
> 3. There is a new NULL dereference in the recent linux-next on Tegra in
> _set_opp() of the gpu/host1x driver. I'll take a closer look at this
> crash a bit later.
I just fixed a bug for devices which don't have the clock property, but just
level or bandwidth. Not sure if that is the one that caused trouble for you. It
is pushed to opp/linux-next.
--
viresh