Re: [PATCH] bus: mhi: pci_generic: Fix device recovery failed issue

From: Manivannan Sadhasivam
Date: Thu Nov 18 2021 - 01:19:25 EST


On Mon, Nov 08, 2021 at 07:31:27PM +0800, Slark Xiao wrote:
> For Foxconn T99W175 device(sdx55 platform) in some host
> platform, it would be unavailable once the host execute
> the err handler.
> After checking, it's caused by the delay time too short to
> get a successful reset.
>
> Please see my test evidence as bewlow(BTW, I add some
> extra test logs in function mhi_pci_reset_prepare and
> mhi_pci_reset_done):
> When MHI_POST_RESET_DELAY_MS equals to 500ms:
> Nov 4 14:30:03 jbd-ThinkEdge kernel: [ 146.222477] mhi mhi0: Device MHI is not in valid state
> Nov 4 14:30:03 jbd-ThinkEdge kernel: [ 146.222628] mhi-pci-generic 0000:2d:00.0: mhi_pci_reset_prepare reset
> Nov 4 14:30:03 jbd-ThinkEdge kernel: [ 146.222631] mhi-pci-generic 0000:2d:00.0: mhi_pci_reset_prepare mhi_soc_reset
> Nov 4 14:30:03 jbd-ThinkEdge kernel: [ 146.222632] mhi mhi0: mhi_soc_reset write soc to reset
> Nov 4 14:30:05 jbd-ThinkEdge kernel: [ 147.839993] mhi-pci-generic 0000:2d:00.0: mhi_pci_reset_done
> Nov 4 14:30:05 jbd-ThinkEdge kernel: [ 147.902063] mhi-pci-generic 0000:2d:00.0: reset failed
>
> When MHI_POST_RESET_DELAY_MS equals to 1000ms or 1500ms:
> Nov 4 19:07:26 jbd-ThinkEdge kernel: [ 157.067857] mhi mhi0: Device MHI is not in valid state
> Nov 4 19:07:26 jbd-ThinkEdge kernel: [ 157.068029] mhi-pci-generic 0000:2d:00.0: mhi_pci_reset_prepare reset
> Nov 4 19:07:26 jbd-ThinkEdge kernel: [ 157.068032] mhi-pci-generic 0000:2d:00.0: mhi_pci_reset_prepare mhi_soc_reset
> Nov 4 19:07:26 jbd-ThinkEdge kernel: [ 157.068034] mhi mhi0: mhi_soc_reset write soc to reset
> Nov 4 19:07:29 jbd-ThinkEdge kernel: [ 159.607006] mhi-pci-generic 0000:2d:00.0: mhi_pci_reset_done
> Nov 4 19:07:29 jbd-ThinkEdge kernel: [ 159.607152] mhi mhi0: Requested to power ON
> Nov 4 19:07:51 jbd-ThinkEdge kernel: [ 181.302872] mhi mhi0: Failed to reset MHI due to syserr state
> Nov 4 19:07:51 jbd-ThinkEdge kernel: [ 181.303011] mhi-pci-generic 0000:2d:00.0: failed to power up MHI controller
>
> When MHI_POST_RESET_DELAY_MS equals to 2000ms:
> Nov 4 17:51:08 jbd-ThinkEdge kernel: [ 147.180527] mhi mhi0: Failed to transition from PM state: Linkdown or Error Fatal Detect to: SYS ERROR Process
> Nov 4 17:51:08 jbd-ThinkEdge kernel: [ 147.180535] mhi mhi0: Device MHI is not in valid state
> Nov 4 17:51:08 jbd-ThinkEdge kernel: [ 147.180722] mhi-pci-generic 0000:2d:00.0: mhi_pci_reset_prepare reset
> Nov 4 17:51:08 jbd-ThinkEdge kernel: [ 147.180725] mhi-pci-generic 0000:2d:00.0: mhi_pci_reset_prepare mhi_soc_reset
> Nov 4 17:51:08 jbd-ThinkEdge kernel: [ 147.180727] mhi mhi0: mhi_soc_reset write soc to reset
> Nov 4 17:51:11 jbd-ThinkEdge kernel: [ 150.230787] mhi-pci-generic 0000:2d:00.0: mhi_pci_reset_done
> Nov 4 17:51:11 jbd-ThinkEdge kernel: [ 150.230928] mhi mhi0: Requested to power ON
> Nov 4 17:51:11 jbd-ThinkEdge kernel: [ 150.231173] mhi mhi0: Power on setup success
> Nov 4 17:51:14 jbd-ThinkEdge kernel: [ 153.254747] mhi mhi0: Wait for device to enter SBL or Mission mode
>
> I also tried big data like 3000, and it worked as well.
> 500ms may not be enough for all support mhi device. We shall
> increase it to 2000ms at least.
>
> Signed-off-by: Slark Xiao <slark_xiao@xxxxxxx>

Reviewed-by: Manivannan Sadhasivam <mani@xxxxxxxxxx>

Thanks,
Mani

> ---
> drivers/bus/mhi/pci_generic.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/bus/mhi/pci_generic.c b/drivers/bus/mhi/pci_generic.c
> index 59a4896a8030..4c577a731709 100644
> --- a/drivers/bus/mhi/pci_generic.c
> +++ b/drivers/bus/mhi/pci_generic.c
> @@ -20,7 +20,7 @@
>
> #define MHI_PCI_DEFAULT_BAR_NUM 0
>
> -#define MHI_POST_RESET_DELAY_MS 500
> +#define MHI_POST_RESET_DELAY_MS 2000
>
> #define HEALTH_CHECK_PERIOD (HZ * 2)
>
> --
> 2.25.1
>