Re: [PATCH] usb: xhci: add xhci_halt() for HCE Handling

From: Thinh Nguyen

Date: Fri Feb 27 2026 - 19:23:27 EST


On Fri, Feb 27, 2026, Dayu Jiang wrote:
> On Thu, Feb 26, 2026 at 06:17:23PM +0000, Thinh Nguyen wrote:
> > On Thu, Feb 26, 2026, Mathias Nyman wrote:
> > > On 2/26/26 11:27, Dayu Jiang wrote:
> > > > Hi Greg,
> > > >
> > > > I have updated the changelog text as requested and resubmitted the patch.
> > > > https://lore.kernel.org/linux-usb/20260128100746.561626-1-jiangdayu@xxxxxxxxxx/
> > > > Please kindly review it and let me know if it is acceptable now.
> > >
> > > I'll send it forward, but changed the commit message.
> > > Does this modified version still describe the case accurately:
> > >
> > > usb: xhci: Prevent interrupt storm on host controller error (HCE)
> > >
> > > The xHCI controller reports a Host Controller Error (HCE) in UAS Storage
> > > Device plug/unplug scenarios on Android devices, which is checked in
> > > xhci_irq() function and causes an interrupt storm (since the interrupt
> > > isn’t cleared), leading to severe system-level faults.
> > >
> > > When the xHC controller reports HCE in the interrupt handler, the driver
> > > only logs a warning and assumes xHC activity will stop. The interrupt storm
> > > does however continue until driver manually disables xHC interrupt and
> > > stops the controller by calling xhci_halt().
> > >
> > > The change is made in xhci_irq() function where STS_HCE status is
> > > checked, mirroring the existing error handling pattern used for
> > > STS_FATAL errors.
> > >
> > > This only fixes the interrupt storm. Proper HCE recovery requires resetting
> > > and re-initializing the xHC.
> > >
> >
> > The controller is halted if there's an error like HCE. It's odd to try
> > to "halt" it again. Not sure how this will impact for other controllers.
> > Even if we don't have the full HCE recovery implemented, did we try to
> > just do HCRST, which is the first step of the recovery?
> A full recovery will not be implemented here. Performing only HCRST without
> a proper recovery procedure may introduce unpredictable risks.

What risks?

> In the xHCI driver flow, the standard handling for exceptions is mainly
> done via xhci_died() or xhci_halt() (please refer to the existing handling
> flow for HSE as a reference).
> When an HCE occurs, the controller is already halted, but the interrupts
> have not been cleared. It has been confirmed that calling xhci_halt() at this
> point can properly resolve the interrupt storm issue.

As I noted in Mathias's reply, I'm OK with this change while waiting for
the proper handling of HCE to be implemented.

BR,
Thinh
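
For readers following the thread, here is a rough user-space sketch of the behaviour under discussion. It is not the patch and not driver code. The names mock_xhc, mock_irq_asserted(), mock_halt(), mock_irq() and mock_deliver_irqs() are hypothetical. The bit positions are assumed to match the driver's xhci.h definitions. The sketch also assumes a simplified model in which the interrupt line stays asserted for as long as interrupts are enabled and an error status bit is set, and in which halting clears the interrupt enables, as the thread describes for xhci_halt().

```c
#include <stdbool.h>
#include <stdint.h>

/* Register bits, assumed to match the driver's xhci.h definitions. */
#define CMD_RUN    (1u << 0)   /* USBCMD: run/stop */
#define CMD_EIE    (1u << 2)   /* USBCMD: interrupter enable */
#define CMD_HSEIE  (1u << 3)   /* USBCMD: host system error interrupt enable */
#define STS_HALT   (1u << 0)   /* USBSTS: controller halted */
#define STS_FATAL  (1u << 2)   /* USBSTS: host system error (HSE) */
#define STS_HCE    (1u << 12)  /* USBSTS: host controller error */

/* Hypothetical stand-in for the controller's operational registers. */
struct mock_xhc {
	uint32_t command;
	uint32_t status;
	unsigned int irq_count;
};

/* Toy model: the line is asserted while enabled and an error bit is set. */
static bool mock_irq_asserted(const struct mock_xhc *xhc)
{
	return (xhc->command & CMD_EIE) &&
	       (xhc->status & (STS_HCE | STS_FATAL));
}

/* Toy halt: stop the controller and mask its interrupts. */
static void mock_halt(struct mock_xhc *xhc)
{
	xhc->command &= ~(CMD_RUN | CMD_EIE | CMD_HSEIE);
	xhc->status |= STS_HALT;
}

/* Toy interrupt handler; halt_on_hce selects the behaviour being proposed. */
static void mock_irq(struct mock_xhc *xhc, bool halt_on_hce)
{
	xhc->irq_count++;

	if (xhc->status & STS_FATAL) {
		mock_halt(xhc);          /* existing HSE-style handling */
		return;
	}
	if (xhc->status & STS_HCE) {
		/* Without the halt, only a warning would be logged here. */
		if (halt_on_hce)
			mock_halt(xhc);  /* masks the interrupt source */
		return;
	}
}

/* Deliver interrupts while the line is asserted, up to a safety limit. */
static unsigned int mock_deliver_irqs(struct mock_xhc *xhc, bool halt_on_hce,
				      unsigned int limit)
{
	while (mock_irq_asserted(xhc) && xhc->irq_count < limit)
		mock_irq(xhc, halt_on_hce);
	return xhc->irq_count;
}
```

Starting from a controller that is already halted with STS_HCE set and CMD_EIE still enabled, the model keeps delivering interrupts up to the limit when halt_on_hce is false, and stops after a single interrupt when it is true. That is the storm and its mitigation in miniature. It says nothing about HCRST or full recovery.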