Re: [PATCH v2] usbhid: tolerate intermittent errors

From: Alan Stern

Date: Sun Mar 08 2026 - 11:19:52 EST


On Sun, Mar 08, 2026 at 02:48:42PM +0100, Liam Mitchell wrote:
> On Sat, 7 Mar 2026 at 23:53, Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote:
> > Do you think a better approach might be to reduce the 13-ms delay to
> > just 1 or 2 ms, and perform a reset only if there has been no successful
> > communication for about one second? This might perhaps be _too_ lenient
> > sometimes, but I don't think such situations will arise in practice.
>
> I would prefer to have at least the first error resubmit immediately.
> I want to reduce device downtime and missed events. From that
> perspective, I want to assume the error is intermittent unless we see
> evidence otherwise.

Okay; a single immediate resubmission won't cause any problems.

> The current reset logic "reset only if there has been no successful
> communication for one second" is problematic because there is no sign
> of successful communication if the user isn't pressing keys or moving
> the mouse. Two EPROTO errors 1.4 seconds apart will trigger device
> reset and 100-200ms of downtime when ideally URBs would be immediately
> resubmitted with only a few ms of downtime.
>
> Can we infer from not receiving errors that we have successful
> communication? That might change the equation. If we don't receive
> errors for say 10x the polling interval, can we assume it is working?

Pretty much, yes. If the communication is not working at all (for
example, if the device was unplugged) then an interrupt URB will fail
within three polling intervals. 10 intervals seems like a reasonable
limit.
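
To make the idea concrete, here is a rough userspace sketch of counting
errors in "streaks", where an error-free gap of 10 polling intervals is
taken as evidence that communication works. The struct, its fields, and
record_error() are hypothetical names for illustration only, not
existing usbhid code, and the threshold is just the value floated above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tracker; these are not fields of the real usbhid driver. */
struct err_tracker {
	uint64_t last_error_ms;	/* time of the most recent error */
	bool seen_error;	/* false until the first error is recorded */
	unsigned int streak;	/* errors not separated by a quiet period */
};

/* Assumed threshold: 10 error-free polling intervals count as "working". */
#define QUIET_INTERVALS 10

/*
 * Record an error observed at now_ms and return the length of the
 * current error streak. A gap of at least QUIET_INTERVALS polling
 * intervals since the previous error starts a new streak.
 */
static unsigned int record_error(struct err_tracker *t, uint64_t now_ms,
				 unsigned int poll_interval_ms)
{
	uint64_t quiet_ms = (uint64_t)QUIET_INTERVALS * poll_interval_ms;

	if (!t->seen_error || now_ms - t->last_error_ms >= quiet_ms)
		t->streak = 0;	/* long quiet gap: assume the link was fine */

	t->seen_error = true;
	t->last_error_ms = now_ms;
	return ++t->streak;
}
```

Assuming an 8-ms polling interval (an arbitrary value for the example),
two EPROTO errors 1.4 seconds apart would each begin a new streak of
length 1, so neither would count toward a reset.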

> Ideally the reset is only triggered when we are very sure the device
> is not working and needs it.

Agreed. I don't know how frequently the bad states that HID devices get
into can be fixed by a reset, but I suspect it's not very frequent at
all.

> > The reason for having at least a small delay is to avoid getting into a
> > tight resubmit/error loop in cases where the device has been unplugged.
> >
> > Alan Stern
>
> This patch would only allow one immediate resubmission per window
> (500ms). How costly is a URB submission? I was assuming they are
> relatively cheap and even one per 100ms wouldn't cause problems.

This problem mainly shows up in syzbot testing. Submission isn't all
that expensive, but in the virtual environment used by syzbot, failure
occurs during or shortly after submission. If resubmission is then
immediate after failure, the whole thing becomes an unending tight loop
executing mostly in atomic context, which ties up a CPU long enough to
trigger a warning about a possible kernel hang.
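
A minimal sketch of a policy that avoids the tight loop while still
allowing one immediate resubmission might look like the following. The
enum, the constants, and choose_action() are made up for illustration
and are not taken from the patch or from the driver; the 2-ms delay and
one-second reset threshold are the values suggested earlier in this
thread.

```c
#include <stdint.h>

/* Hypothetical actions; not identifiers from the usbhid driver. */
enum retry_action {
	RETRY_NOW,	/* resubmit the URB immediately */
	RETRY_DELAYED,	/* resubmit after RETRY_DELAY_MS */
	RETRY_RESET,	/* give up on retrying and reset the device */
};

#define RETRY_DELAY_MS	2	/* assumed: "just 1 or 2 ms" */
#define RESET_AFTER_MS	1000	/* assumed: "about one second" */

/*
 * streak: number of errors seen without an intervening quiet period.
 * streak_age_ms: time elapsed since the first error of this streak.
 */
static enum retry_action choose_action(unsigned int streak,
				       uint64_t streak_age_ms)
{
	if (streak <= 1)
		return RETRY_NOW;	/* first error: assume it is intermittent */
	if (streak_age_ms >= RESET_AFTER_MS)
		return RETRY_RESET;	/* persistent failure for ~1 second */
	return RETRY_DELAYED;		/* small delay breaks the tight loop */
}
```

Under these assumptions an unplugged device would see at most roughly
500 delayed resubmissions in the second before the reset, rather than an
unbounded number of back-to-back ones.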

Alan Stern