Re: [PATCH v7 10/42] clk: davinci: New driver for davinci PSC clocks

From: Bartosz Golaszewski
Date: Fri Mar 02 2018 - 12:39:53 EST


2018-03-01 17:44 GMT+01:00 David Lechner <david@xxxxxxxxxxxxxx>:
> On 03/01/2018 02:36 AM, Bartosz Golaszewski wrote:
>>
>> 2018-02-28 22:40 GMT+01:00 David Lechner <david@xxxxxxxxxxxxxx>:
>>>
>>> On 02/28/2018 06:38 AM, Bartosz Golaszewski wrote:
>>>>
>>>>
>>>>
>>>> I think I found the reason for the strange crashes we were
>>>> experiencing (emac core->name being NULL) thanks to Sekhar who pointed
>>>> me in the right direction.
>>>>
>>>> The mdio driver fails to probe with v7 due to the supplied clock rate
>>>> being wrong. Before failing we register the emac clock with
>>>> pm_clk_add_clk(). When clock_ops puts the clock, it decreases the
>>>> reference count of the clock, but we never actually increased it in
>>>> the first place in the line above. The core clock code then destroys
>>>> the associated clk_core structure. When the next user comes around (in
>>>> our case the clk debug functions) the system crashes.
>>>>
>>>> I believe there to be two issues: one is with v7 - we need to increase
>>>> the clock reference count in davinci_psc_genpd_attach_dev().
>>>>
>>>> Second is the error path in the clock framework - we should remove the
>>>> destroyed clk_core from the debug list, which is not being done now.
>>>>
>>>> Why we even need to track the refcount of clk_core is a mystery to me
>>>> though. Stephen, Mike?
>>>>
>>>> Best regards,
>>>> Bartosz Golaszewski
>>>
>>>
>>>
>>> Great find. I figured it had to be something like this, but I wasn't
>>> able to reproduce the problem yet.
>>>
>>> I suppose it is time to spin up a v8 with some fixes.
>>
>>
>> I still don't know why the mdio clock rate is much lower than in
>> mainline though. Any ideas?
>>
>> Thanks,
>> Bart
>>
>
> Now that you have fixed the crash, can you answer the questions I
> asked earlier?
>
>> Can you post the output of this command so that I can see how your
>> clocks are set up:
>
> cat /sys/kernel/debug/clk/clk_summary
>
>> Using your workaround, can you run:
>
>
> cat /sys/kernel/debug/pm_genpd/pm_genpd_summary
>
> If you see:
> 1e27000.clock-controller: emac off-0
>
> then genpd is not working like it is supposed to. You should see something
> like this for devices that are working:
> 1e27000.clock-controller: uart2 on
> /devices/platform/soc@1c00000/1d0d000.serial active

I used of_clk_get() in the genpd attach callback so the crash no
longer happens, but I still can't boot it over NFS due to mdio
failing. Do you have any idea why the clock rate differs between v7
and mainline?
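
To spell out the refcount imbalance behind the crash, here is a minimal
userspace model in C. It is only a sketch, not kernel code: struct
model_core and the model_*() helpers are hypothetical stand-ins for
clk_core, the get done by of_clk_get() and the put done by the PM clock
cleanup, and "destroyed" is a flag rather than a real free.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a clock core; NOT the kernel's clk_core. */
struct model_core {
	int refcount;		/* starts at 1: the provider's own reference */
	bool destroyed;		/* set when the last reference is dropped */
	bool on_debug_list;	/* still reachable by debug code */
};

/* Provider registers the clock and holds the initial reference. */
static void model_register(struct model_core *c)
{
	c->refcount = 1;
	c->destroyed = false;
	c->on_debug_list = true;
}

/* Consumer takes a reference (what a clk get is assumed to do). */
static void model_get(struct model_core *c)
{
	c->refcount++;
}

/*
 * Consumer drops a reference. The debug list entry is deliberately
 * left in place, mirroring the error path described in this thread.
 */
static void model_put(struct model_core *c)
{
	if (--c->refcount == 0)
		c->destroyed = true;
}

/*
 * Model of "attach, then probe fails and cleanup puts the clock".
 * Returns true if a destroyed core is still on the debug list.
 */
static bool model_failed_probe(struct model_core *c, bool take_ref)
{
	if (take_ref)
		model_get(c);	/* the fix: balance the later put */
	model_put(c);		/* cleanup after the failed probe */
	return c->destroyed && c->on_debug_list;
}
```

Without the get, the single put drops the provider's reference and the
model reports a dangling debug entry; with the get, the count goes back
to 1 and nothing is destroyed.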

From the logs I can see that the genpd domains are correctly registered
and the provider is added (you should probably skip setting up the
domains in legacy mode, though). The pm clocks are enabled again (after
being disabled by mdio after its failed probe()), but the boot process
gets stuck after the kernel gets an IP address over DHCP, which is
strange because it apparently had some kind of network connection.

On Monday I'll prepare a small ramfs and boot over tftp only and see from there.

Best regards,
Bartosz