Re: [PATCH] Revert "of: property: fw_devlink: Add support for "phy-handle" property"

From: Florian Fainelli
Date: Fri Sep 17 2021 - 13:32:02 EST


On 9/17/21 10:21 AM, Saravana Kannan wrote:
> On Fri, Sep 17, 2021 at 9:59 AM Florian Fainelli <f.fainelli@xxxxxxxxx> wrote:
>>
>> On 9/16/21 7:27 PM, Saravana Kannan wrote:
>>> On Thu, Sep 16, 2021 at 7:21 PM Florian Fainelli <f.fainelli@xxxxxxxxx> wrote:
>>>>
>>>>
>>>>
>>>> On 9/15/2021 1:19 AM, Saravana Kannan wrote:
>>>>> This reverts commit cf4b94c8530d14017fbddae26aad064ddc42edd4.
>>>>>
>>>>> Some PHYs pointed to by "phy-handle" will never bind to a driver until a
>>>>> consumer attaches to it. And when the consumer attaches to it, they get
>>>>> forcefully bound to a generic PHY driver. In such cases, parsing the
>>>>> phy-handle property and creating a device link will prevent the consumer
>>>>> from ever probing. We don't want that. So revert support for
>>>>> "phy-handle" property until we come up with a better mechanism for
>>>>> binding PHYs to generic drivers before a consumer tries to attach to it.
>>>>>
>>>>> Signed-off-by: Saravana Kannan <saravanak@xxxxxxxxxx>
>>>>
>>>> Thanks for getting this revert submitted, I just ran a bisection this
>>>> afternoon that pointed to this offending commit. It would cause the dead
>>>> lock
>>>
>>> Dead lock in the kernel? Or do you mean just a hang waiting forever for network?
>>
>> It locks up since we try to acquire __device_driver_lock() while we are
>> in the main driver's probe function.
>
> Off the top of my head that sounds weird, but I'll look into the
> logs/code later. Bunch of other stuff and some LPC prep comes first :)
>
>>
>>>
>>>> on boot with drivers/net/dsa/bcm_sf2.c pasted below.
>>>
>>> The log is too jumbled up to be readable (word wrap I suppose) and
>>> maybe even multiple thread printing at the same time.
>>
>> Re-attached (thunderbird is not really helping me).
>
> Thanks!
>
>>
>>>
>>>> Saravana, can
>>>> you CC on me on your fix or what you would want me to be testing?
>>>
>>> By fix, I assume you mean when I bring back phy-handle with a proper
>>> fix to handle the case in the commit text? Yeah, that's going to take
>>> a while. It's brewing in my head and there are some issues that's not
>>> fully resolved. But I haven't had time to code it up or dig into the
>>> net code to make sure it'll work. But yes, I'll CC you when I do so
>>> you can test it with this case.
>>
>> Well by fix, I meant something that does not lock up on my system,
>
> Hold on. Now I'm confused. Are you still hitting hangs/issues after
> the phy-handle patch is reverted?

With the 'phy-handle' patch reverted no, I do not see the hang.

>
>> it is
>> a different problem from supporting 'phy-handle', but it should not
>> regress an existing system, no matter how quirky that system behaves in
>> its probe function. For history and reference, the "offending" change
>> and background can be found here:
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=771089c2a485958e423f305e974303760167b45c
>
> So that is the change that's interacting with the phy-handle patch to
> cause the deadlock?
>
> I'm a bit confused on what needs debugging now.

Sorry was not very clear, what I meant to write is that your next
attempt at solving 'phy-handle' should not be regressing on my system,
and I will make sure you get appropriate testing of that patch.
--
Florian