Re: [PATCH 0/2] Disable SS instances in park mode for SC7180/ SC7280

From: Krishna Kurapati PSSNV
Date: Fri May 31 2024 - 10:27:37 EST




On 5/31/2024 7:47 PM, Doug Anderson wrote:
Hi,

On Fri, May 31, 2024 at 5:33 AM Konrad Dybcio <konrad.dybcio@xxxxxxxxxx> wrote:

On 30.05.2024 3:34 PM, Doug Anderson wrote:
Hi,

On Thu, May 30, 2024 at 1:26 AM Krishna Kurapati
<quic_kriskura@xxxxxxxxxxx> wrote:

When working in host mode, in certain conditions, when the USB
host controller is stressed, an "HC died" warning comes up.
Fix this up by disabling SS instances in park mode for SC7280 and SC7180.

Krishna Kurapati (2):
arm64: dts: qcom: sc7180: Disable SS instances in park mode
arm64: dts: qcom: sc7280: Disable SS instances in park mode

arch/arm64/boot/dts/qcom/sc7180.dtsi | 1 +
arch/arm64/boot/dts/qcom/sc7280.dtsi | 1 +
2 files changed, 2 insertions(+)
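
A rough way to check on a running DUT whether the quirk made it into the live device tree is sketched below. It assumes the series sets the DWC3 binding property snps,parkmode-disable-ss-quirk (the binding's "disable SS instances in park mode" flag) and that the kernel exposes the device tree under /sys/firmware/devicetree/base. The function name and the DT_ROOT override are hypothetical and exist only for illustration.

```shell
#!/bin/sh
# Sketch only: list device tree nodes that carry the park mode quirk.
# Assumption: the property is snps,parkmode-disable-ss-quirk.
# DT_ROOT and list_quirk_nodes are hypothetical names for this example.

DT_ROOT="${DT_ROOT:-/sys/firmware/devicetree/base}"
QUIRK="snps,parkmode-disable-ss-quirk"

list_quirk_nodes() {
    # Boolean properties show up as empty files; print their parent node.
    find "$DT_ROOT" -name "$QUIRK" 2>/dev/null | while IFS= read -r prop; do
        dirname "$prop"
    done
}
```

Calling list_quirk_nodes on the DUT prints one node path per controller that has the property, and prints nothing if no node has it.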

FWIW, the test case I used to reproduce this:

1. Plug in a USB dock w/ Ethernet
2. Plug a USB 3 SD card reader into the dock.
3. Use lsusb -t to confirm both Ethernet and card reader are on USB3.
4. From a shell, run for i in $(seq 5); do dd if=/dev/sdb of=/dev/null
bs=4M; done to read from the card reader.
5. At the same time, stress the Internet. If you've got a very fast
Internet connection then running Google's "Internet speed test" did
it, but I could also reproduce by just running this from a PC
connected to the same network as my DUT: ssh ${DUT} "dd of=/dev/null"
< /dev/zero
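
Step 4 above can be sketched as a small script. This is only a sketch: the function names and the DRY_RUN switch are hypothetical and not part of the original test case, while /dev/sdb, bs=4M and the loop count of 5 come from the steps above. With DRY_RUN=1 (the default here) it only prints the dd commands. The network load from step 5 still has to be started separately from the PC.

```shell
#!/bin/sh
# Sketch only: the storage half of the reproduction steps.
# DEV, LOOPS, DRY_RUN, run and read_card are hypothetical names.

DEV="${DEV:-/dev/sdb}"        # card reader block device from step 4
LOOPS="${LOOPS:-5}"           # number of full reads, as in $(seq 5)
DRY_RUN="${DRY_RUN:-1}"       # 1 = print commands, 0 = execute them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

read_card() {
    i=0
    while [ "$i" -lt "$LOOPS" ]; do
        # Read the whole device and throw the data away.
        run dd if="$DEV" of=/dev/null bs=4M
        i=$((i + 1))
    done
}
```

Setting DRY_RUN=0 and then calling read_card performs the actual reads; the device name may differ on other boards.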

I would also note that, though I personally reproduced this on sc7180
and sc7280 boards and thus Krishna posted the patch for those boards,
there's no reason to believe that this problem doesn't affect all of
Qualcomm's SoCs. It would be nice if someone at Qualcomm could post a
followup patch fixing this everywhere.

Right, this sounds like a more widespread issue

That said, I couldn't reproduce it on SC8280XP / X13s (which does NOT mean
8280 isn't affected). My setup was:

- USB3 5 Gb/s hub plugged into one of the side USBs
- on-hub 1 Gb/s network hub connected straight to my router with a
600 / 60 Mbps link, spamming speedtest-cli and dd-over-ssh
- M.2 SSD connected over a USB adapter, nearing 280 MB/s speeds (the
adapter isn't particularly speedy)

So it stands to reason that it might not have been enough to trigger it.

In my case I wasn't using anything nearly as fast as an M.2 SSD. I was
just using a normal USB3 SD card reader. That being said, multiple
people at Qualcomm were able to replicate the issue without lots of
back and forth, so I'd guess that the problem isn't that sensitive to
the exact storage device. I will also note that it's not sensitive to
the exact network device as I replicated it with two Ethernet adapters
with very different chipsets.

My only guess is that somehow SC8280XP is faster and that changes the
timing of how it handles interrupts. I guess you could try capping
your cpufreq in sysfs and see if that makes a difference in
reproducing. ;-) ...or maybe somehow SC8280XP has a newer version of
the IP where they've fixed this?
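
A minimal sketch of the cpufreq capping idea, assuming the standard cpufreq sysfs layout (policy*/scaling_max_freq, values in kHz). The function name and the CPUFREQ_ROOT override are hypothetical, and the frequency to pick is board specific, so none is suggested here.

```shell
#!/bin/sh
# Sketch only: cap the maximum CPU frequency on every cpufreq policy.
# CPUFREQ_ROOT and cap_cpufreq are hypothetical names for this example.

CPUFREQ_ROOT="${CPUFREQ_ROOT:-/sys/devices/system/cpu/cpufreq}"

cap_cpufreq() {
    khz="$1"    # target maximum frequency in kHz
    for policy in "$CPUFREQ_ROOT"/policy*; do
        # Skip policies that are missing or not writable (needs root).
        [ -w "$policy/scaling_max_freq" ] || continue
        echo "$khz" > "$policy/scaling_max_freq"
    done
}
```

Running cap_cpufreq with a value in kHz as root on the DUT, then repeating the stress test, would show whether a slower CPU makes the failure easier to hit.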

It would be interesting if someone with a SDM845 dragonboard could try
replicating since that seems highly likely to reproduce, at least.


Hi Konrad, Doug,

Downstream, we usually set this quirk only on Gen-1 targets. It is not specific to this test case; it is there to avoid this kind of controller-going-dead issue. I can pick out the Gen-1 targets (other than sc7280/sc7180) and send a separate series that adds the quirk to all of them.

Regards,
Krishna