Re: [PATCH v2 0/7] usb: gadget: Fix net_device lifecycle with device_move

From: Kuen-Han Tsai

Date: Mon Mar 16 2026 - 02:48:19 EST


On Mon, Mar 16, 2026 at 2:36 PM Greg Kroah-Hartman
<gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
>
> On Mon, Mar 16, 2026 at 02:17:09PM +0800, Kuen-Han Tsai wrote:
> > Hi Luca,
> >
> > On Fri, Mar 13, 2026 at 8:40 PM Luca Weiss <luca.weiss@xxxxxxxxxxxxx> wrote:
> > >
> > > Hi Kuen-Han,
> > >
> > > On Mon Mar 9, 2026 at 1:04 PM CET, Kuen-Han Tsai wrote:
> > > > PROBLEMS
> > > > --------
> > > > The net_device in f_ncm is allocated at function instance creation
> > > > and registered at bind time with the gadget device as its sysfs parent.
> > > > When the gadget unbinds, the parent device is destroyed but the
> > > > net_device survives, leaving dangling sysfs symlinks and a NULL pointer
> > > > dereference when userspace accesses the orphaned interface:
> > > >
> > > > Problem 1: Unable to handle kernel NULL pointer dereference
> > > > Call trace:
> > > > __pi_strlen+0x14/0x150
> > > > rtnl_fill_ifinfo+0x6b4/0x708
> > > > rtmsg_ifinfo_build_skb+0xd8/0x13c
> > > > ...
> > > > netlink_sendmsg+0x2e0/0x3d4
> > > >
> > > > Problem 2: Dangling sysfs symlinks
> > > > console:/ # ls -l /sys/class/net/ncm0
> > > > lrwxrwxrwx ... /sys/class/net/ncm0 ->
> > > > /sys/devices/platform/.../gadget.0/net/ncm0
> > > > console:/ # ls -l /sys/devices/platform/.../gadget.0/net/ncm0
> > > > ls: .../gadget.0/net/ncm0: No such file or directory
> > > >
> > > > BACKGROUND & THE REVERTS
> > > > ------------------------
> > > > The deferred allocation causes a regression for userspace tools during
> > > > network setup (such as the postmarketOS DHCP daemon). By moving the
> > > > allocation out of alloc_inst, configfs returns the name pattern "usb%d"
> > > > instead of the actual interface name (e.g., "usb0") when userspace reads
> > > > the 'ifname' attribute.
> > > >
> > > > Investigating a fix for this naming issue revealed a deeper
> > > > architectural flaw introduced by the series. Deferring the allocation to
> > > > bind() means that a single function instance will spawn multiple network
> > > > devices if it is symlinked to multiple USB configurations.
> > > >
> > > > Because all configurations tied to the same function instance are
> > > > architecturally designed to share a single network device, and configfs
> > > > only exposes a single 'ifname' attribute per instance, this 1-to-many
> > > > bug cannot be safely patched.
> > > >
> > > > To restore the correct 1:1 mapping and resolve the userspace
> > > > regressions, this series reverts the changes in reverse order, returning
> > > > the net_device allocation back to the instance level (alloc_inst).
> > > >
> > > > THE NEW SOLUTION
> > > > ----------------
> > > > Use device_move() to reparent the net_device between the gadget device
> > > > tree and /sys/devices/virtual across bind/unbind cycles. On the last
> > > > unbind, device_move(NULL) moves the net_device to the virtual device
> > > > tree before the gadget device is destroyed. On rebind, device_move()
> > > > reparents it back under the new gadget, restoring proper sysfs topology
> > > > and power management ordering.
> > > >
> > > > The 1:1 mapping between function instance and net_device is maintained,
> > > > and configfs always reports the resolved interface name.
> > > >
> > > > A bind_count tracks how many configurations reference the function
> > > > instance, ensuring device_move fires only on the first bind.
> > > > __free(detach_gadget) ensures the net_device is moved back to virtual
> > > > if bind fails after a successful device_move, preventing dangling
> > > > sysfs on partial bind failure.
> > >
> > > Applying this series on v7.0-rc3 fixes the reported issues for me on
> > > Qualcomm-based Fairphone (Gen. 6). For v7.0-rc3 the first two commits
> > > need to be skipped; it looks like the original commits are only in
> > > -next and not in v7.0-rc?
> > >
> > > Tested-by: Luca Weiss <luca.weiss@xxxxxxxxxxxxx> # milos-fairphone-fp6
> > >
> > > Thanks for fixing this!
> > >
> > > Regards
> > > Luca
> >
> > Thanks for testing.
> >
> > That is correct. The first two commits:
> >
> > - [Patch v2 1/7] Revert "usb: gadget: f_ncm: Fix atomic context locking issue"
> > - [Patch v2 2/7] Revert "usb: legacy: ncm: Fix NPE in gncm_bind"
> >
> > have not been merged into mainline yet, so skipping them for your
> > test was the right move. This series is based on Greg's usb-linus
> > branch rather than on Linus's master branch.
>
> These should all now be in 7.0-rc4, right?
>

Right, I saw all of these patches land in 7.0-rc4 [1].

Thanks for the review and for merging these.
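
For anyone reading along in the archive, the bind_count behaviour from
the cover letter quoted above can be modeled in userspace roughly as
follows. This is only a sketch under stated assumptions, not the code
from the series: every model_* name is hypothetical, the "move" just
flips an enum instead of calling the kernel's device_move(), and the
rollback on a failed bind is written out by hand where the series
relies on __free(detach_gadget).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; none of these are kernel types. */
enum model_parent { PARENT_VIRTUAL, PARENT_GADGET };

struct model_netdev {
	enum model_parent parent;	/* modeled sysfs parent */
	int move_calls;			/* times the modeled move ran */
};

struct model_instance {
	struct model_netdev net;	/* one net_device per function instance */
	int bind_count;			/* configurations currently bound */
};

/* Stand-in for device_move(): only records the new parent. */
void model_device_move(struct model_netdev *net, enum model_parent to)
{
	net->parent = to;
	net->move_calls++;
}

void model_instance_init(struct model_instance *inst)
{
	inst->net.parent = PARENT_VIRTUAL;
	inst->net.move_calls = 0;
	inst->bind_count = 0;
}

/*
 * Returns 0 on success, -1 when a later bind step fails (modeled by
 * fail_after_move). Only the first bind reparents the device, and a
 * failure after that move puts the device back under "virtual".
 */
int model_bind(struct model_instance *inst, bool fail_after_move)
{
	bool moved = false;

	if (inst->bind_count == 0) {
		model_device_move(&inst->net, PARENT_GADGET);
		moved = true;
	}

	if (fail_after_move) {
		if (moved)
			model_device_move(&inst->net, PARENT_VIRTUAL);
		return -1;
	}

	inst->bind_count++;
	return 0;
}

/* Only the last unbind moves the device back to "virtual". */
void model_unbind(struct model_instance *inst)
{
	assert(inst->bind_count > 0);

	if (--inst->bind_count == 0)
		model_device_move(&inst->net, PARENT_VIRTUAL);
}
```

Binding the same instance into two configurations triggers a single
modeled move, and the device only returns to "virtual" once both
configurations have unbound, which is the 1:1 instance to net_device
mapping the series is meant to preserve.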

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/log/drivers/usb/gadget?h=v7.0-rc4

Regards,
Kuen-Han