Re: [BUG BISECT] phy: rockchip-inno-usb2: Sync initial otg state
From: Michael Riesch
Date: Mon Aug 22 2022 - 03:01:22 EST
Hi Peter,
On 8/20/22 12:23, Peter Geis wrote:
>
>
> On 8/17/2022 4:25 AM, Michael Riesch wrote:
>> Hi Peter,
>>
>> On 8/16/22 17:27, Peter Geis wrote:
>>> On Tue, Aug 16, 2022 at 11:20 AM Michael Riesch
>>> <michael.riesch@xxxxxxxxxxxxxx> wrote:
>>>>
>>>> Hi all,
>>>>
>>>> On 8/4/22 11:49, Peter Geis wrote:
>>>>> On Tue, Aug 2, 2022 at 2:39 PM Markus Reichl
>>>>> <m.reichl@xxxxxxxxxxxxx> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> with linux-next-20220728 rk3399-roc-pc does not boot.
>>>>>> Bisecting pointed to this commit.
>>>>>> By reverting this commit the board boots again.
>>>>>
>>>>> Thank you for reporting this, someone was kind enough to reproduce the
>>>>> problem on the rockpro64 and confirmed this is an issue. As I won't
>>>>> have access to my hardware until next month, we should probably revert
>>>>> this until the root cause can be identified.
>>>>
>>>> Just experienced this issue on my ROCK3 Model A board (RK3568) and
>>>> reverting this commit solved it.
>>>>
>>>> Having the revert in v6.0-rc2 would be great -- if there is anything I
>>>> can help to accelerate this please let me know.
>>>
>>> If this is now happening on rk356x where I know it works, it now
>>> cements my theory that it's a symptom and not the actual problem.
>>> Possibly a race condition with the grf and regmap code where it isn't
>>> quite ready when called. This code path is called exactly the same way
>>> later on when the irq fires.
>>>
>>> What config are you based on? I'm running a stripped down version of
>>> the arm64_defconfig, but if you deviate from that it will be helpful
>>> in reproducing the issue.
>>
>> I posted my Kconfig here: https://pastebin.com/P1As0W4k
>>
>> FWIW the ROCK3 board has a switch to set the OTG port to device or host,
>> respectively. The NPE does not occur when the switch is set to host.
>>
>> Best regards,
>> Michael
>
> Good Afternoon Michael,
>
> Please try the following fix.
>
> Very Respectfully,
> Peter Geis
>
> diff --git a/drivers/phy/rockchip/phy-rockchip-inno-usb2.c
> b/drivers/phy/rockchip/phy-rockchip-inno-usb2.c
> index 0b1e9337ee8e..5fc7c374a6b4 100644
> --- a/drivers/phy/rockchip/phy-rockchip-inno-usb2.c
> +++ b/drivers/phy/rockchip/phy-rockchip-inno-usb2.c
> @@ -1169,6 +1169,7 @@ static int rockchip_usb2phy_otg_port_init(struct
> rockchip_usb2phy *rphy,
> /* do initial sync of usb state */
> ret = property_enabled(rphy->grf, &rport->port_cfg->utmi_id);
> extcon_set_state_sync(rphy->edev, EXTCON_USB_HOST, !ret);
> + ret = 0;
> }
> }
Thanks, this patch indeed solves the issue in my setup. With both role
switch settings the NPE does not occur anymore, the correct role is
selected, and both roles work (tested with zerogadget (device) and a USB
drive (host)). Can you spin a patch?
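A minimal sketch of why the one-line "ret = 0;" plausibly helps. It assumes that rockchip_usb2phy_otg_port_init() falls through to its "out:" label and returns ret, and that the caller treats any non-zero value as a failure; neither is shown in the quoted diff. All names below are hypothetical stand-ins, not the driver's code.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for property_enabled(rphy->grf, ...utmi_id):
 * returns true when the ID state reads "device", false for "host".
 */
static bool id_reads_device(bool switch_set_to_device)
{
	return switch_set_to_device;
}

/* Pattern before the fix: the boolean is parked in ret and leaks out. */
static int port_init_without_reset(bool switch_set_to_device)
{
	int ret = 0;

	/* do initial sync of usb state */
	ret = id_reads_device(switch_set_to_device);
	/* ... the extcon host state would be set to !ret here ... */

	return ret;	/* 1 in device mode: looks like an error to the caller */
}

/* Pattern after the fix: ret is cleared once the sync is done. */
static int port_init_with_reset(bool switch_set_to_device)
{
	int ret = 0;

	/* do initial sync of usb state */
	ret = id_reads_device(switch_set_to_device);
	/* ... the extcon host state would be set to !ret here ... */
	ret = 0;

	return ret;	/* always 0: the sync result is not an error code */
}
```

Under these assumptions only the device setting yields a non-zero return, which would match the observation that the NPE did not occur with the switch set to host.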
NB: On the ROCK3 the device tree needs a fix to get the host role going,
I'll need to take a closer look at this one and spin a patch.
What I still find strange (but is unrelated to the commit "phy:
rockchip-inno-usb2: Sync initial otg state") is that two and four xhci
controllers pop up in the device role and the host role, respectively.
For example, in the device role there is a pair of controllers
# lsusb | grep xhci
Bus 006 Device 001: ID 1d6b:0003 Linux 6.0.0-rc1+ xhci-hcd xHCI Host Controller
Bus 005 Device 001: ID 1d6b:0002 Linux 6.0.0-rc1+ xhci-hcd xHCI Host Controller
# hexdump /sys/bus/usb/devices/usb5/of_node/reg
0000000 0000 0000 00fd 0000 0000 0000 4000 0000
0000010
# hexdump /sys/bus/usb/devices/usb6/of_node/reg
0000000 0000 0000 00fd 0000 0000 0000 4000 0000
0000010
that are related to the same device (in this case usb_host1_xhci). I
would have expected a single controller. Anyone care to enlighten me a
bit why there is a pair of them?
Thanks and best regards,
Michael
>>> We should revert it until it's isolated, as well as the patch setting
>>> the rk356x to otg since it will again be broken. If someone could
>>> weigh in here as well (I currently don't have access to my hardware)
>>> it would be helpful.
>>>
>>>>
>>>> Thanks and best regards,
>>>> Michael
>>>>
>>>>>
>>>>> Very Respectfully,
>>>>> Peter Geis
>>>>>
>>>>>>
>>>>>> [ 2.398700] Unable to handle kernel NULL pointer dereference at
>>>>>> virtual address
>>>>>> 0000000000000008
>>>>>> [ 2.399517] Mem abort info:
>>>>>> [ 2.399772] ESR = 0x0000000096000004
>>>>>> [ 2.400114] EC = 0x25: DABT (current EL), IL = 32 bits
>>>>>> [ 2.400594] SET = 0, FnV = 0
>>>>>> [ 2.400873] EA = 0, S1PTW = 0
>>>>>> [ 2.401161] FSC = 0x04: level 0 translation fault
>>>>>> [ 2.401602] Data abort info:
>>>>>> [ 2.401864] ISV = 0, ISS = 0x00000004
>>>>>> [ 2.402212] CM = 0, WnR = 0
>>>>>> [ 2.402484] user pgtable: 4k pages, 48-bit VAs,
>>>>>> pgdp=0000000001376000
>>>>>> [ 2.403071] [0000000000000008] pgd=0000000000000000,
>>>>>> p4d=0000000000000000
>>>>>> [ 2.403687] Internal error: Oops: 96000004 [#1] SMP
>>>>>> [ 2.404130] Modules linked in: ip_tables x_tables ipv6
>>>>>> xhci_plat_hcd xhci_hcd
>>>>>> dwc3 rockchipdrm drm_cma_helper analogix_dp dw_hdmi realtek
>>>>>> drm_display_helper
>>>>>> dwc3_of_simple dw_mipi_dsi ehci_platform ohci_platform ohci_hcd
>>>>>> ehci_hcd
>>>>>> drm_kms_helper dwmac_rk syscopyarea sysfillrect stmmac_platform
>>>>>> sysimgblt
>>>>>> fb_sys_fops usbcore stmmac pcs_xpcs drm phylink
>>>>>> drm_panel_orientation_quirks
>>>>>> [ 2.407155] CPU: 4 PID: 71 Comm: kworker/4:6 Not tainted
>>>>>> 5.19.0-rc8-next-20220728 #437
>>>>>> [ 2.407868] Hardware name: Firefly ROC-RK3399-PC Mezzanine
>>>>>> Board (DT)
>>>>>> [ 2.408448] Workqueue: events rockchip_usb2phy_otg_sm_work
>>>>>> [ 2.408958] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT
>>>>>> -SSBS BTYPE=--)
>>>>>> [ 2.411634] pc : rockchip_usb2phy_otg_sm_work+0x50/0x330
>>>>>> [ 2.414332] lr : process_one_work+0x1d8/0x380
>>>>>> [ 2.416948] sp : ffff800009373d60
>>>>>> [ 2.419406] x29: ffff800009373d60 x28: 0000000000000000 x27:
>>>>>> 0000000000000000
>>>>>> [ 2.422199] x26: ffff0000f779fcb8 x25: ffff0000f77a3a05 x24:
>>>>>> 000000000000000c
>>>>>> [ 2.424978] x23: 0000000000000000 x22: ffff0000010c8258 x21:
>>>>>> ffff80000888ec10
>>>>>> [ 2.427768] x20: ffff0000010c82f0 x19: 000000000000000c x18:
>>>>>> 0000000000000001
>>>>>> [ 2.430604] x17: 000000040044ffff x16: 00400034b5503510 x15:
>>>>>> 0000000000000000
>>>>>> [ 2.433390] x14: ffff000000708000 x13: ffff8000eec96000 x12:
>>>>>> 0000000034d4d91d
>>>>>> [ 2.436185] x11: 0000000000000000 x10: 0000000000000a10 x9 :
>>>>>> ffff000001aa7a74
>>>>>> [ 2.438958] x8 : fefefefefefefeff x7 : 0000000000000018 x6 :
>>>>>> ffff000001aa7a74
>>>>>> [ 2.441668] x5 : 000073746e657665 x4 : 000000000000002f x3 :
>>>>>> ffff00000356c808
>>>>>> [ 2.444407] x2 : ffff800009373da4 x1 : 000000000000e2ac x0 :
>>>>>> ffff80000888eb34
>>>>>> [ 2.447190] Call trace:
>>>>>> [ 2.449557] rockchip_usb2phy_otg_sm_work+0x50/0x330
>>>>>> [ 2.452169] process_one_work+0x1d8/0x380
>>>>>> [ 2.454684] worker_thread+0x170/0x4e0
>>>>>> [ 2.457056] kthread+0xd8/0xdc
>>>>>> [ 2.459354] ret_from_fork+0x10/0x20
>>>>>> [ 2.461728] Code: 91037015 295be001 f9403c77 b940e413 (f94006e0)
>>>>>> [ 2.464338] ---[ end trace 0000000000000000 ]---
>>>>>>
>>>>>> Am 22.06.22 um 02:31 schrieb Peter Geis:
>>>>>>> The initial otg state for the phy defaults to device mode. The
>>>>>>> actual
>>>>>>> state isn't detected until an ID IRQ fires. Fix this by syncing
>>>>>>> the ID
>>>>>>> state during initialization.
>>>>>>>
>>>>>>> Fixes: 51a9b2c03dd3 ("phy: rockchip-inno-usb2: Handle ID IRQ")
>>>>>>> Signed-off-by: Peter Geis <pgwipeout@xxxxxxxxx>
>>>>>>> ---
>>>>>>> drivers/phy/rockchip/phy-rockchip-inno-usb2.c | 6 ++++++
>>>>>>> 1 file changed, 6 insertions(+)
>>>>>>>
>>>>>>> diff --git a/drivers/phy/rockchip/phy-rockchip-inno-usb2.c
>>>>>>> b/drivers/phy/rockchip/phy-rockchip-inno-usb2.c
>>>>>>> index 6711659f727c..6e44069617df 100644
>>>>>>> --- a/drivers/phy/rockchip/phy-rockchip-inno-usb2.c
>>>>>>> +++ b/drivers/phy/rockchip/phy-rockchip-inno-usb2.c
>>>>>>> @@ -1162,6 +1162,12 @@ static int
>>>>>>> rockchip_usb2phy_otg_port_init(struct rockchip_usb2phy *rphy,
>>>>>>> EXTCON_USB_HOST,
>>>>>>> &rport->event_nb);
>>>>>>> if (ret)
>>>>>>> dev_err(rphy->dev, "register USB HOST
>>>>>>> notifier failed\n");
>>>>>>> +
>>>>>>> + if (!of_property_read_bool(rphy->dev->of_node,
>>>>>>> "extcon")) {
>>>>>>> + /* do initial sync of usb state */
>>>>>>> + ret = property_enabled(rphy->grf,
>>>>>>> &rport->port_cfg->utmi_id);
>>>>>>> + extcon_set_state_sync(rphy->edev,
>>>>>>> EXTCON_USB_HOST, !ret);
>>>>>>> + }
>>>>>>> }
>>>>>>>
>>>>>>> out:
>>>>>>
>>>>>> Gruß,
>>>>>> --
>>>>>> Markus Reichl
>>>>>