Re: [PATCH v4] usb: gadget: u_ether: Replace netif_stop_queue with netif_device_detach

From: Hardik Gajjar
Date: Tue Apr 30 2024 - 11:34:33 EST


On Sun, Apr 28, 2024 at 11:07:36PM +0200, Ferry Toth wrote:
> Hi,
>
> Op 25-04-2024 om 23:27 schreef Ferry Toth:
> > Hi,
> >
> > Op 17-04-2024 om 17:13 schreef Hardik Gajjar:
> > > On Tue, Apr 16, 2024 at 04:48:32PM +0300, Andy Shevchenko wrote:
> > > > On Thu, Apr 11, 2024 at 10:52:36PM +0200, Ferry Toth wrote:
> > > > > Op 11-04-2024 om 18:39 schreef Andy Shevchenko:
> > > > > > On Thu, Apr 11, 2024 at 04:26:37PM +0200, Hardik Gajjar wrote:
> > > > > > > On Wed, Apr 10, 2024 at 08:37:42PM +0300, Andy Shevchenko wrote:
> > > > > > > > On Sun, Apr 07, 2024 at 10:51:51PM +0200, Ferry Toth wrote:
> > > > > > > > > Op 05-04-2024 om 13:38 schreef Hardik Gajjar:
> > > >
> > > > ...
> > > >
> > > > > > > > > Exactly. And this didn't happen before the 2 patches.
> > > > > > > > >
> > > > > > > > > To be precise: /sys/class/net/usb0 is not
> > > > > > > > > removed and it is a link, the link
> > > > > > > > > target /sys/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/gadget.0/net/usb0
> > > > > > > > > no
> > > > > > > > > longer exists
> > > > > > > So, it means that the /sys/class/net/usb0 is
> > > > > > > present, but the symlink is
> > > > > > > broken. In that case, the dwc3 driver should
> > > > > > > recreate the device, and the
> > > > > > > symlink should become active again
> > > > >
> > > > > Yes, on first enabling gadget (when device mode is activated):
> > > > >
> > > > > root@yuna:~# ls
> > > > > /sys/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/gadget.0/
> > > > > driver  net  power  sound  subsystem  suspended  uevent
> > > > >
> > > > > Then switching to host mode:
> > > > >
> > > > > root@yuna:~# ls
> > > > > /sys/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/gadget.0/
> > > > > ls: cannot access
> > > > > '/sys/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/gadget.0/':
> > > > > No such file
> > > > > or directory
> > > > >
> > > > > Then back to device mode:
> > > > >
> > > > > root@yuna:~# ls
> > > > > /sys/devices/pci0000:00/0000:00:11.0/dwc3.0.auto/gadget.0/
> > > > > driver  power  sound  subsystem  suspended  uevent
> > > > >
> > > > > net is missing. But, network functions:
> > > > >
> > > > > root@yuna:~# ping 10.42.0.1
> > > > > PING 10.42.0.1 (10.42.0.1): 56 data bytes
> > > > >
> > > > > Mass storage device is created and removed each time as expected.
> > > >
> > > > So, what's the conclusion? Shall we move towards revert of those
> > > > two changes?
> > >
> > >
> > > As promised, I have the tested the this patch with the dwc3 gadget.
> > > I could not reproduce
> > > the issue.
> > >
> > > I can see the usb0 exist all the time and accessible regardless of
> > > the role switching of the USB mode (peripheral <-> host)
> > >
> > > Following are the logs:
> > > //Host to device
> > >
> > > console:/sys/bus/platform/devices/a800000.ssusb # echo "peripheral"
> > > > mode
> > > console:/sys/bus/platform/devices/a800000.ssusb # ls
> > > a800000.dwc3/gadget/net/
> > > usb0
> > >
> > > //device to host
> > > console:/sys/bus/platform/devices/a800000.ssusb # echo "host" > mode
> > > console:/sys/bus/platform/devices/a800000.ssusb # ls
> > > a800000.dwc3/gadget/net/
> > > usb0
> >
> > That is weird. When I switch to host mode (using the physical switch),
> > the whole gadget directory is removed (now testing 6.9.0-rc5)
> >
> > Switching back to device mode, that gadget directory is recreated. And
> > gadget/sound as well, but not gadget/net.
> >
> > > s a800000.dwc3/gadget/net/usb0                                               
> > > <
> > > addr_assign_type    duplex             phys_port_name
> > > addr_len            flags              phys_switch_id
> > > address             gro_flush_timeout  power
> > > broadcast           ifalias            proto_down
> > > carrier             ifindex            queues
> > > carrier_changes     iflink             speed
> > > carrier_down_count  link_mode          statistics
> > > carrier_up_count    mtu                subsystem
> > > dev_id              name_assign_type   tx_queue_len
> > > dev_port            netdev_group       type
> > > device              operstate          uevent
> > > dormant             phys_port_id       waiting_for_supplier
> > > console:/sys/bus/platform/devices/a800000.ssusb # ifconfig -a usb0
> > > usb0      Link encap:Ethernet  HWaddr 3a:8b:63:97:1a:9a
> > >            BROADCAST MULTICAST  MTU:1500  Metric:1
> > >            RX packets:0 errors:0 dropped:0 overruns:0 frame:0
> > >            TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
> > >            collisions:0 txqueuelen:1000
> > >            RX bytes:0 TX bytes:0
> > >
> > > console:/sys/bus/platform/devices/a800000.ssusb #
> > >
> > > I strongly advise against reverting the patch solely based on the
> > > observed issue of removing the /sys/class/net/usb0 directory while
> > > the usb0 interface remains available.
> >
> > There's more to it. I also mentioned that switching the role or
> > unplugging the cable leaves the usb0 connection.
> >
> > I have while in host mode:
> > root@yuna:~# ifconfig -a usb0
> > usb0: flags=-28605<UP,BROADCAST,RUNNING,MULTICAST,DYNAMIC>  mtu 1500
> >         inet 10.42.0.221  netmask 255.255.255.0  broadcast 10.42.0.255
> >         inet6 fe80::a8bb:ccff:fedd:eef1  prefixlen 64  scopeid 0x20<link>
> >
> >
> > You don't see that because you didn't create a connection at all.
> >
> > > Instead, I recommend enabling FTRACE to trace the functions involved
> > > and identify which faulty call is responsible for removing usb0.
> >
> > Switching from device -> host -> device:
> >
> > root@yuna:~# trace-cmd record -p function_graph -l *gether_*
> >   plugin 'function_graph'
> > Hit Ctrl^C to stop recording
> > ^CCPU0 data recorded at offset=0x1c8000
> >     188 bytes in size (4096 uncompressed)
> > CPU1 data recorded at offset=0x1c9000
> >     0 bytes in size (0 uncompressed)
> > root@yuna:~# trace-cmd report
> > cpus=2
> >      irq/68-dwc3-725   [000]   514.575337: funcgraph_entry:      #
> > 2079.480 us |  gether_disconnect();
> >      irq/68-dwc3-946   [000]   524.263731: funcgraph_entry:      +
> > 11.640 us  |  gether_disconnect();
> >      irq/68-dwc3-946   [000]   524.263743: funcgraph_entry:      !
> > 116.520 us |  gether_connect();
> >      irq/68-dwc3-946   [000]   524.268029: funcgraph_entry:      #
> > 2057.260 us |  gether_disconnect();
> >      irq/68-dwc3-946   [000]   524.270089: funcgraph_entry:      !
> > 109.000 us |  gether_connect();
>
> I tried to get a more useful trace:
> root@yuna:/sys/kernel/tracing# echo 'gether_*' > set_ftrace_filter
> root@yuna:/sys/kernel/tracing# echo 'eem_*' >> set_ftrace_filter
> root@yuna:/sys/kernel/tracing# echo function > current_tracer
> root@yuna:/sys/kernel/tracing# echo 'reset_config' >> set_ftrace_filter
> -> switch to host mode then back to device
> root@yuna:/sys/kernel/tracing# cat trace
> # tracer: function
> #
> # entries-in-buffer/entries-written: 53/53 #P:2
> #
> # _-----=> irqs-off/BH-disabled
> # / _----=> need-resched
> # | / _---=> hardirq/softirq
> # || / _--=> preempt-depth
> # ||| / _-=> migrate-disable
> # |||| / delay
> # TASK-PID CPU# ||||| TIMESTAMP FUNCTION
> # | | | ||||| | |
> irq/68-dwc3-523 [000] D..3. 133.990254: reset_config
> <-__composite_disconnect
> irq/68-dwc3-523 [000] D..3. 133.992274: eem_disable
> <-reset_config
> irq/68-dwc3-523 [000] D..3. 133.992276: gether_disconnect
> <-reset_config
> kworker/1:3-443 [001] ...1. 134.022453: eem_unbind
> <-purge_configs_funcs
>
> -> to device mode
>
> kworker/1:3-443 [001] ...1. 148.630773: eem_bind
> <-usb_add_function
> irq/68-dwc3-734 [000] D..3. 149.155209: eem_set_alt
> <-composite_setup
> irq/68-dwc3-734 [000] D..3. 149.155215: gether_disconnect
> <-eem_set_alt
> irq/68-dwc3-734 [000] D..3. 149.155220: gether_connect
> <-eem_set_alt
> irq/68-dwc3-734 [000] D..3. 149.157287: eem_set_alt
> <-composite_setup
> irq/68-dwc3-734 [000] D..3. 149.157292: gether_disconnect
> <-eem_set_alt
> irq/68-dwc3-734 [000] D..3. 149.159338: gether_connect
> <-eem_set_alt
> irq/68-dwc3-734 [000] D..2. 149.239625: eem_unwrap <-rx_complete
> ...
>
> I don't know where to look exactly. Any hints?

do you see anything related to gether_cleanup() after eem_unbind() ?
If not then, you may try to enable tracing of TCP/IP stack and network side to check who deleting the sysfs entry

Hardik


>
> >
> > > According to current kernel architecture of u_ether driver, only
> > > gether_cleanup should remove the usb0 interface along with its
> > > kobject and sysfs interface.
> > > I suggest sharing the analysis here to understand why this practice
> > > is not followed in your use case or driver ?
> >
> > Yes, I'll try to trace where that happens.
> >
> > Nevertheless, the disappearance of the net/usb0 directory seems
> > harmless? But the usb: net device remaining after disconnect or role
> > switch is not good, as the route remains.
> >
> > May be they are 2 separate problems. Could you try to reproduce what
> > happens if you make eem connection and then unplug?
> >
> > > I am curious why the driver was developed without adhering to the
> > > kernel's gadget architecture.
>
> I don't know what you mean here. Which driver do you mean?
>
> > > >
> > > > > > > I have the dwc3 IP base usb controller, Let me check
> > > > > > > with this patch and
> > > > > > > share result here.  May be we need some fix in dwc3
> > > > > Would have been nice if someone could test on other
> > > > > controller as well. But
> > > > > another instance of dwc3 is also very welcome.
> > > > > > It's quite possible, please test on your side.
> > > > > > We are happy to test any fixes if you come up with.
> > > >
> > > > --
> > > > With Best Regards,
> > > > Andy Shevchenko
> > > >
> > > >
> >