Re: xhci_reset_endpoint() doesn't reset endpoint

From: Michal Necasek
Date: Thu Dec 15 2016 - 10:42:29 EST


On 12/14/2016 3:28 PM, Mathias Nyman wrote:
On 14.12.2016 12:58, Michal Necasek wrote:
prior to the endpoint reset. ClearFeature(ENDPOINT_HALT) resets the toggle
on the device, but not on the host. But we know for a fact that the
device sends a packet (with data toggle 0) which the host USB stack
never sees, and a data toggle mismatch explains that quite well.

We are using USBFS to talk to the printer, but that shouldn't matter
much. I will note that the available documentation<1> explicitly says
that USBDEVFS_RESETEP and USBDEVFS_CLEAR_HALT both reset the data
toggle. That is indeed the case for the Linux EHCI driver but not
xHCI. Both of the USBFS IOCTLs call into xhci_reset_endpoint() which
does nothing.
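
For reference, a minimal sketch of how a usbfs client issues these two
ioctls from C. It assumes fd is an already opened /dev/bus/usb/BBB/DDD node
and that the caller supplies the endpoint address; the wrapper names
usbfs_clear_halt() and usbfs_reset_ep() are made up for illustration. Both
ioctls take a pointer to an unsigned int holding the endpoint address.

```c
#include <assert.h>
#include <errno.h>
#include <sys/ioctl.h>
#include <linux/usbdevice_fs.h>

/*
 * Hypothetical wrapper: ask the kernel to clear the halt condition on one
 * endpoint of a usbfs device. ep_addr is the endpoint address, for example
 * 0x81 for IN endpoint 1. Returns 0 on success or a negative errno value.
 */
int usbfs_clear_halt(int fd, unsigned int ep_addr)
{
    if (ioctl(fd, USBDEVFS_CLEAR_HALT, &ep_addr) < 0)
        return -errno;
    return 0;
}

/*
 * Hypothetical wrapper: ask the kernel to reset the endpoint state it
 * tracks for ep_addr. Same return convention as above.
 */
int usbfs_reset_ep(int fd, unsigned int ep_addr)
{
    if (ioctl(fd, USBDEVFS_RESETEP, &ep_addr) < 0)
        return -errno;
    return 0;
}
```

Going by the behaviour described in this thread, a zero return value on an
xHCI host should not be taken as proof that the host and device toggles are
back in sync when the endpoint was not halted.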


This is very likely the case.

Thanks for confirming that.

xhci cannot reset the host side of the endpoint unless it really is
halted. From xhci 4.6.8:

"If the endpoint is not in the Halted state when a Reset Endpoint Command
is executed, the xHC shall reject the command and generate a Command
Completion Event with the Completion Code set to Context State Error."

Right, the Reset Endpoint is limited to halted endpoints, not even stopped ones. It's really only useful for error cleanup.
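
To make the restriction concrete, here is a small user-space model of the
rule quoted above. It is only a sketch, not driver code: struct model_ep and
model_reset_endpoint() are invented for illustration, and the model assumes
the command is issued with Transfer State Preserve cleared, so a successful
reset zeroes the toggle. The numeric values follow the xHCI endpoint state
and completion code encodings.

```c
#include <assert.h>

/* Endpoint Context "EP State" encodings from the xHCI spec. */
enum ep_state {
    EP_DISABLED = 0,
    EP_RUNNING  = 1,
    EP_HALTED   = 2,
    EP_STOPPED  = 3,
    EP_ERROR    = 4
};

/* The two command completion codes this model needs. */
enum comp_code {
    COMP_SUCCESS             = 1,
    COMP_CONTEXT_STATE_ERROR = 19
};

/* Hypothetical stand-in for the host controller's view of one endpoint. */
struct model_ep {
    enum ep_state state;
    unsigned int  toggle;   /* host-side data toggle, 0 or 1 */
};

/*
 * Model of the Reset Endpoint command: rejected unless the endpoint is
 * Halted. On success the toggle is cleared (TSP assumed to be 0) and the
 * endpoint moves to Stopped.
 */
enum comp_code model_reset_endpoint(struct model_ep *ep)
{
    assert(ep);
    if (ep->state != EP_HALTED)
        return COMP_CONTEXT_STATE_ERROR;
    ep->toggle = 0;
    ep->state = EP_STOPPED;
    return COMP_SUCCESS;
}
```

A Running or Stopped endpoint therefore keeps whatever toggle value it had,
which is the situation the printer ends up in.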

The normal halt/stall case is that xhci receives a STALL from the device,
immediately resets the endpoint (clearing the host-side toggle), and then
propagates the HALT status to usb core. USB core then sends
ClearFeature(ENDPOINT_HALT) to the device, which resets the toggle for the
device side of the endpoint, so the host and device toggles are in sync.

After this, xhci_endpoint_reset() is called by usb core to inform xhci
that the endpoint was reset, but currently we don't do anything in it.
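
The sequence above, and the failure reported in this thread, can be
illustrated with a toy model of a bulk IN endpoint. This is a simplified
USB 2.0 style sketch, not a description of what the controller does
internally: struct toggle_pair and all three functions are invented for
illustration, and the clear-halt request is modelled as nothing more than
zeroing the device-side toggle.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: the data toggle as seen by each side. */
struct toggle_pair {
    unsigned int host;
    unsigned int dev;
};

/*
 * One bulk IN packet. If the toggles match, the host accepts the data and
 * both sides advance. If they differ, the host takes the packet for a
 * retransmission, ACKs it and drops the payload; only the device advances.
 * Returns true when the host actually delivers the data.
 */
bool in_packet(struct toggle_pair *t)
{
    assert(t);
    if (t->dev != t->host) {
        t->dev ^= 1;
        return false;
    }
    t->host ^= 1;
    t->dev ^= 1;
    return true;
}

/* STALL path: host resets its side, then the clear-halt resets the device. */
void stall_and_recover(struct toggle_pair *t)
{
    assert(t);
    t->host = 0;    /* Reset Endpoint on the halted endpoint */
    t->dev = 0;     /* clear-halt request handled by the device */
}

/* Clear-halt sent to an endpoint that was never halted: device side only. */
void clear_halt_without_stall(struct toggle_pair *t)
{
    assert(t);
    t->dev = 0;
}
```

Starting from matching toggles, one accepted packet followed by
clear_halt_without_stall() leaves the host expecting DATA1 while the device
sends DATA0, so the next in_packet() call returns false: the packet is sent
but the host stack never sees it.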

OK, that's what I thought was happening.

If ClearFeature(ENDPOINT_HALT) is sent without the endpoint actually being
halted, we cannot reset it from xhci with a Reset Endpoint command. Instead
we should issue a Configure Endpoint command to reset the host-side toggle,
as mentioned in a final note in xhci 1.0 120814:

"Note: The Reset Endpoint Command may only be issued to endpoints in the
Halted state. If software wishes to reset the Data Toggle or Sequence Number
of an endpoint that isn't in the Halted state, then software may issue a
Configure Endpoint Command with the Drop and Add bits set for the target
endpoint that is in the Stopped state."
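
As a rough sketch of what "Drop and Add bits set for the target endpoint"
means in terms of the Input Control Context, the helper below computes both
flag words from an endpoint address. The names ep_dci(), struct
input_ctrl_flags and toggle_reset_flags() are invented for illustration. The
bit position is the endpoint's Device Context Index. Setting A0 (the slot
context flag) as well is an assumption based on my reading of the Configure
Endpoint rules in section 4.6.6; check it against the spec before relying on
it.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Device Context Index of a non-control endpoint:
 * 2 * endpoint number, plus 1 for IN endpoints.
 */
unsigned int ep_dci(uint8_t ep_addr)
{
    unsigned int num = ep_addr & 0x0f;
    unsigned int in = (ep_addr & 0x80) ? 1 : 0;

    assert(num != 0);   /* endpoint 0 is not handled by this sketch */
    return 2 * num + in;
}

/* Hypothetical container for the two Input Control Context flag words. */
struct input_ctrl_flags {
    uint32_t drop_flags;
    uint32_t add_flags;
};

/*
 * Flags for a Configure Endpoint command that drops and re-adds a single
 * endpoint. A0 is also set in add_flags (assumption, see the text above).
 */
struct input_ctrl_flags toggle_reset_flags(uint8_t ep_addr)
{
    uint32_t bit = UINT32_C(1) << ep_dci(ep_addr);
    struct input_ctrl_flags f;

    f.drop_flags = bit;
    f.add_flags = bit | UINT32_C(1);
    return f;
}
```

For IN endpoint 1 (address 0x81) the index is 3, so the drop word is 0x8 and
the add word is 0x9. A real command also needs a valid input endpoint
context for the endpoint being re-added, and the endpoint has to be stopped
first; neither step is shown here.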

There was a case with a scanner we believed had the same issue, and we
tried to resolve it by issuing the Configure Endpoint command in
xhci_endpoint_reset(), but if I remember correctly it did not resolve the
case and the code never got anywhere.

Ah, that's interesting. I'm not surprised that we weren't the first ones running into this. The scenario is a bit obscure but not very device specific.

I might have some really old implementation somewhere for this; at least
there is a really old and outdated hack at

git://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git ep_reset_halt_test
https://git.kernel.org/cgit/linux/kernel/git/mnyman/xhci.git/log/?h=ep_reset_halt_test

which really is quite a hack, and based on the 3.19 kernel, so it's probably
only useful as an idea to base a real solution on.

Thanks, we'll have to try it out. It looks like the right approach to me.

In the meantime, I did a little survey of other operating systems. I know Windows can do it (reset the data toggle for a non-halted endpoint), because the printer works just fine on a Windows host, using both Intel xHCI drivers (Windows 7) and Microsoft xHCI drivers (Windows 10). That is true whether the printer is attached to the host or passed through to a VM. I do not yet fully understand how Windows does it.

Solaris can't do it. FreeBSD can, I believe, but the FreeBSD method is very heavy-handed (they just reset everything in sight that's remotely related to the endpoint).

I also looked at Apple's xHCI driver <1> (not new, but the most recent one published) and found that they do exactly what the xHCI 1.1 spec suggests, doing a dummy Configure Endpoint command which drops and re-adds the endpoint without explicitly changing its state. There are even some interesting comments: "State is not halted, so the quiesce endpoint did not clear the toggles. This will clear the toggles", and "Now do configure endpoint with both add and drop flags set."


Thanks,
Michal

1: https://opensource.apple.com/source/IOUSBFamily/IOUSBFamily-560.4.2/AppleUSBXHCI/Classes/AppleUSBXHCIUIM.cpp.auto.html