Re: [PATCH] scsi: ufs: qcom: move ufs_qcom_host_reset() to ufs_qcom_device_reset()

From: Can Guo
Date: Fri Dec 01 2023 - 07:14:04 EST


Hi Mani,

On 12/1/2023 1:18 PM, Manivannan Sadhasivam wrote:
On Wed, Nov 29, 2023 at 08:10:57PM +0800, Ziqi Chen wrote:


On 11/28/2023 7:27 PM, Manivannan Sadhasivam wrote:
On Tue, Nov 28, 2023 at 03:40:57AM +0800, Ziqi Chen wrote:


On 11/22/2023 2:14 PM, Can Guo wrote:


On 10/25/2023 3:41 PM, Manivannan Sadhasivam wrote:
On Tue, Oct 24, 2023 at 07:10:15PM +0800, Ziqi Chen wrote:
During PISI test, we found that the host Tx is still bursting after

What is PISI test?

SI measurement.


Please expand it in the patch description.

Sure, I will update it in the next patch version.



H/W reset. Move ufs_qcom_host_reset() to ufs_qcom_device_reset() and
reset host before device reset to stop tx burst.


device_reset() callback is supposed to reset only the device and not
the host.
So NACK for this patch.

Agree, the change should come in a more reasonable way.

Actually, similar code is already there in ufs_mtk_device_reset() in
ufs-mediatek.c; I guess this patch is trying to mimic that approach.

From a functionality point of view, we do need this change,
because I occasionally (2 out of 10 times) hit a PHY error on lane 0 during
reboot tests (I tried SM8350, SM8450 and SM8550, all with the same result).

[    1.911188] [DEBUG]ufshcd_update_uic_error: UECPA:0x80000002
[    1.922843] [DEBUG]ufshcd_update_uic_error: UECDL:0x80004000
[    1.934473] [DEBUG]ufshcd_update_uic_error: UECN:0x0
[    1.944688] [DEBUG]ufshcd_update_uic_error: UECT:0x0
[    1.954901] [DEBUG]ufshcd_update_uic_error: UECDME:0x0

I found out that the PHY error pops up right after the UFS device gets
reset in the 2nd init. With this change in place, the PA/DL
errors are gone.

Hi Mani,

There is another way: adding a new vops that calls XXX_host_reset() from
the SoC vendor driver. This way, we can call this vops in the core layer
without depending on device reset.
Since we have already observed this error and received many similar reports
from different OEMs, we need to fix it in some way.
If you think the above approach is acceptable, I will send a new patch soon.
Otherwise, could you give us another suggestion?


First, please describe the issue in detail. How the issue is getting triggered
and then justify your change. I do not have access to the bug reports that you
received.

From the waveform measured by Samsung, we can see that at the end of the 2nd
Link Startup, the host keeps bursting after the H/W reset. This abnormal
timing would cause the PA/DL errors mentioned by Can.

On the other hand, at the end of the 1st Link Startup, the host stops bursting
first and then sends the H/W reset to the device. So Samsung suggested doing a
host reset before every device reset to fix this issue. That's what you saw
in this patch. This patch has been verified by OEMs.


Thanks for the detail. This info should have been part of the patch description.

So do you think we can keep this change with the details added to the commit
message, or do we need to make other improvements?


For sure we should not do host reset within the device_reset callback. I'd like
to know at what point in time we are seeing the host burst after device reset. I
mean, can you point me to the code in the ufshcd driver where calling
device_reset triggers the issue? Then we can do a host_reset before that
_specific_ device_reset with the help of the new vops you suggested.

Actually, any time we are about to reset the device, we need to reset the host first because, as Ziqi mentioned, if the host is still bursting after the device is reset, it may lead to PA/DL errors. This can be a bit confusing, because the host can be bursting flow control frames and/or dummy frames even when SW thinks it is idle.

The PHY error cannot be easily observed because it is non-fatal: it does not trigger error handling, and there are no logs or prints in the serial console, meaning it is silent. However, we have an error history in which PHY errors can be recorded. Although the PHY error is non-fatal, we don't want to see any of them, because our PHY team and customers are requesting zero tolerance for PHY errors.
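
To illustrate how a silent, non-fatal error can still be tracked, here is a minimal userspace sketch of a fixed-depth error history. It is only loosely modeled on the idea described above; the names err_hist, err_hist_record(), err_hist_valid(), err_hist_recent() and the depth ERR_HIST_LEN are hypothetical and are not the actual ufshcd structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ERR_HIST_LEN 8 /* hypothetical history depth */

/* Hypothetical ring buffer of raw error register values. */
struct err_hist {
	uint32_t val[ERR_HIST_LEN]; /* most recent register values */
	unsigned int pos;           /* next slot to overwrite */
	unsigned long cnt;          /* total errors seen, including overwritten ones */
};

/* Record one error value without logging or triggering error handling. */
static void err_hist_record(struct err_hist *h, uint32_t reg_val)
{
	h->val[h->pos] = reg_val;
	h->pos = (h->pos + 1) % ERR_HIST_LEN;
	h->cnt++;
}

/* Number of entries currently held in the buffer. */
static size_t err_hist_valid(const struct err_hist *h)
{
	return h->cnt < ERR_HIST_LEN ? (size_t)h->cnt : ERR_HIST_LEN;
}

/* Return the n-th most recent entry (n == 0 is the newest). */
static uint32_t err_hist_recent(const struct err_hist *h, size_t n)
{
	assert(n < err_hist_valid(h));
	return h->val[(h->pos + ERR_HIST_LEN - 1 - n) % ERR_HIST_LEN];
}
```

With something like this, a reboot test can check that cnt stays at zero after each cycle even though nothing shows up in the serial console.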

Currently, there are 3 scenarios where host reset should go before device reset:

1. When Linux boots up, in ufshcd_hba_init(), we reset the device. In this case, we need to reset the host before resetting the device, because the previous boot stage usually leaves both the device and the host active before jumping to Linux. This is the first case, which this change was originally made for.

2. When the 2nd init kicks off in ufshcd_probe_hba(), we reset the device. In this case, we need to reset the host before resetting the device. This is the case I mentioned in my previous reply.

3. In the UFS error handler, we reset the device. In this case, we need to reset the host before resetting the device.
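
The ordering described in the three scenarios above could be sketched roughly as follows. This is a self-contained userspace mock, not kernel code: mock_hba, mock_variant_ops, the host_reset callback and mock_reset_host_then_device() are all hypothetical names used only to show the idea of a separate host reset vops that the core layer calls before device reset, so that device_reset itself stays device-only.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

struct mock_hba;

/* Hypothetical variant ops: only the two callbacks relevant here. */
struct mock_variant_ops {
	int (*host_reset)(struct mock_hba *hba);   /* hypothetical new vops */
	int (*device_reset)(struct mock_hba *hba); /* resets the device only */
};

struct mock_hba {
	const struct mock_variant_ops *vops;
	char trace[64]; /* records call order: 'H' = host, 'D' = device */
};

static void trace_add(struct mock_hba *hba, const char *ev)
{
	strncat(hba->trace, ev, sizeof(hba->trace) - strlen(hba->trace) - 1);
}

/* Stand-ins for the SoC vendor driver callbacks. */
static int mock_vendor_host_reset(struct mock_hba *hba)
{
	trace_add(hba, "H");
	return 0;
}

static int mock_vendor_device_reset(struct mock_hba *hba)
{
	trace_add(hba, "D");
	return 0;
}

static const struct mock_variant_ops mock_vendor_vops = {
	.host_reset = mock_vendor_host_reset,
	.device_reset = mock_vendor_device_reset,
};

/* Core-layer helper: stop host bursting first, then reset the device. */
static int mock_reset_host_then_device(struct mock_hba *hba)
{
	int ret;

	assert(hba && hba->vops);

	if (hba->vops->host_reset) {
		ret = hba->vops->host_reset(hba);
		if (ret) /* warn and continue, like the dev_warn() in the patch */
			fprintf(stderr, "host reset returned %d\n", ret);
	}

	if (!hba->vops->device_reset)
		return -EOPNOTSUPP;

	return hba->vops->device_reset(hba);
}
```

In this sketch, the boot path, the 2nd init and the error handler would all go through the same helper, so the host is always reset before the device.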

Thanks,
Can Guo.


- Mani


-Ziqi


- Mani

-Ziqi


Thanks,
Can Guo.

- Mani

Signed-off-by: Ziqi Chen <quic_ziqichen@xxxxxxxxxxx>
---
  drivers/ufs/host/ufs-qcom.c | 13 +++++++------
  1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/ufs/host/ufs-qcom.c b/drivers/ufs/host/ufs-qcom.c
index 96cb8b5..43163d3 100644
--- a/drivers/ufs/host/ufs-qcom.c
+++ b/drivers/ufs/host/ufs-qcom.c
@@ -445,12 +445,6 @@ static int ufs_qcom_power_up_sequence(struct ufs_hba *hba)
      struct phy *phy = host->generic_phy;
      int ret;
-    /* Reset UFS Host Controller and PHY */
-    ret = ufs_qcom_host_reset(hba);
-    if (ret)
-        dev_warn(hba->dev, "%s: host reset returned %d\n",
-                  __func__, ret);
-
      /* phy initialization - calibrate the phy */
      ret = phy_init(phy);
      if (ret) {
@@ -1709,6 +1703,13 @@ static void ufs_qcom_dump_dbg_regs(struct ufs_hba *hba)
  static int ufs_qcom_device_reset(struct ufs_hba *hba)
  {
      struct ufs_qcom_host *host = ufshcd_get_variant(hba);
+    int ret = 0;
+
+    /* Reset UFS Host Controller and PHY */
+    ret = ufs_qcom_host_reset(hba);
+    if (ret)
+        dev_warn(hba->dev, "%s: host reset returned %d\n",
+                  __func__, ret);
      /* reset gpio is optional */
      if (!host->device_reset)
--
2.7.4