Re: [PATCH] usb: xhci: add xhci_halt() for HCE Handling

From: Dayu Jiang

Date: Fri Feb 27 2026 - 02:26:58 EST


On Thu, Feb 26, 2026 at 06:17:23PM +0000, Thinh Nguyen wrote:
> On Thu, Feb 26, 2026, Mathias Nyman wrote:
> > On 2/26/26 11:27, Dayu Jiang wrote:
> > > Hi Greg,
> > >
> > > I have updated the changelog text as requested and resubmitted the patch.
> > > https://lore.kernel.org/linux-usb/20260128100746.561626-1-jiangdayu@xxxxxxxxxx/
> > > Please kindly review it and let me know if it is acceptable now.
> >
> > I'll send it forward, but changed the commit message.
> > Does this modified version still describe the case accurately:
> >
> > usb: xhci: Prevent interrupt storm on host controller error (HCE)
> >
> > The xHCI controller reports a Host Controller Error (HCE) in UAS Storage
> > Device plug/unplug scenarios on Android devices, which is checked in
> > xhci_irq() function and causes an interrupt storm (since the interrupt
> > isn’t cleared), leading to severe system-level faults.
> >
> > When the xHC controller reports HCE in the interrupt handler, the driver
> > only logs a warning and assumes xHC activity will stop. The interrupt storm
> > does however continue until driver manually disables xHC interrupt and
> > stops the controller by calling xhci_halt().
> >
> > The change is made in xhci_irq() function where STS_HCE status is
> > checked, mirroring the existing error handling pattern used for
> > STS_FATAL errors.
> >
> > This only fixes the interrupt storm. Proper HCE recovery requires resetting
> > and re-initializing the xHC.
> >
>
> The controller is halted if there's an error like HCE. It's odd to try
> to "halt" it again. Not sure how this will impact for other controllers.
> Even if we don't have the full HCE recovery implemented, did we try to
> just do HCRST, which is the first step of the recovery?

A full recovery will not be implemented in this patch, and performing only
HCRST without the rest of the recovery procedure may introduce unpredictable
risks.

In the xHCI driver, exceptions are normally handled via xhci_died() or
xhci_halt() (please refer to the existing handling of HSE as a reference).

When an HCE occurs, the controller is already halted, but the interrupts
have not been cleared. It has been confirmed that calling xhci_halt() at this
point resolves the interrupt storm.
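
To illustrate why calling xhci_halt() on an already-halted controller still
helps, here is a small userspace model in C. It is only a sketch under stated
assumptions, not driver code: struct mock_xhc, mock_irq_pending(), mock_halt(),
mock_irq() and count_irqs() are hypothetical names, and the rule that the IRQ
line stays asserted while HCE is set and interrupts are enabled is a
simplification. The bit positions follow the USBCMD/USBSTS layout in the xHCI
specification. mock_halt() mirrors what xhci_halt() does via xhci_quiesce(), as
far as I understand it: clear the interrupt enable bits, and clear Run/Stop
only if the controller is not already halted.

```c
#include <stdbool.h>
#include <stdint.h>

/* USBCMD bits (xHCI spec layout). */
#define CMD_RUN     (1u << 0)
#define CMD_EIE     (1u << 2)
#define CMD_HSEIE   (1u << 3)
#define CMD_EWE     (1u << 10)
#define IRQ_ENABLES (CMD_EIE | CMD_HSEIE | CMD_EWE)

/* USBSTS bits (xHCI spec layout). */
#define STS_HALT    (1u << 0)
#define STS_HCE     (1u << 12)

/* Hypothetical stand-in for the operational registers. */
struct mock_xhc {
	uint32_t command;
	uint32_t status;
};

/* Assumption: the IRQ keeps firing while HCE is set and interrupts are enabled. */
static bool mock_irq_pending(const struct mock_xhc *hc)
{
	return (hc->status & STS_HCE) && (hc->command & CMD_EIE);
}

/* Models the halt path: always drop the interrupt enables,
 * and drop Run/Stop only when the controller is still running. */
static void mock_halt(struct mock_xhc *hc)
{
	uint32_t mask = ~IRQ_ENABLES;

	if (!(hc->status & STS_HALT))
		mask &= ~CMD_RUN;
	hc->command &= mask;
	hc->status |= STS_HALT;
}

/* Models the HCE branch of the interrupt handler, with and without the fix. */
static void mock_irq(struct mock_xhc *hc, bool halt_on_hce)
{
	if ((hc->status & STS_HCE) && halt_on_hce)
		mock_halt(hc);
	/* Without the fix the handler only logs a warning and returns. */
}

/* Counts how many times the handler runs, up to 'cap', after an HCE. */
int count_irqs(bool halt_on_hce, int cap)
{
	/* Controller already halted by the error, interrupts still enabled. */
	struct mock_xhc hc = {
		.command = CMD_EIE | CMD_HSEIE,
		.status = STS_HALT | STS_HCE,
	};
	int n = 0;

	while (n < cap && mock_irq_pending(&hc)) {
		mock_irq(&hc, halt_on_hce);
		n++;
	}
	return n;
}
```

In this model, count_irqs(false, 8) returns 8 because the handler is re-entered
until the cap is reached, while count_irqs(true, 8) returns 1 because the first
pass clears the interrupt enables.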
> BR,
> Thinh