Re: Regression: netlink fail (triggered by iw) removes extra wlan (phy) interface

From: Cong Wang
Date: Wed Mar 16 2016 - 12:01:16 EST


On Thu, Feb 25, 2016 at 5:22 AM, Rafał Miłecki <zajec5@xxxxxxxxx> wrote:
> Hi,
>
> After updating kernel in OpenWrt from 4.1.6 to 4.1.10 I noticed that
> if "iw" command fails (which happens very rarely) my wlan0-1 interface
> disappears. To trigger this problem easily I'm using this trivial
> script:
> while [ 1 ]
> do
> iw phy phy0 interface add mon0 type monitor
> ifconfig mon0 up
> iw dev mon0 del
> done
>
> Whenever it goes wrong I see:
> Failed to connect to generic netlink.
> kern.info kernel: [ 1933.114338] br-lan: port 3(wlan0-1) entered disabled state
> kern.info kernel: [ 1933.335568] device wlan0-1 left promiscuous mode
> kern.info kernel: [ 1933.340385] br-lan: port 3(wlan0-1) entered disabled state
> daemon.notice netifd: Network device 'wlan0-1' link is down
> command failed: Too many open files in system (-23)
>
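
For capturing logs it may help to have the loop stop at the first failure instead of spinning forever. The following is only a sketch of the same reproduction steps: the function names repro_once and repro_loop and the iteration limit are made up for illustration, while the iw and ifconfig invocations are the ones from the script above.

```shell
#!/bin/sh
# Sketch only: same steps as the quoted loop, but stops at the first failure.
# repro_once and repro_loop are hypothetical helper names.

# Run one add/up/del cycle; return non-zero as soon as any step fails.
repro_once() {
    iw phy "$1" interface add "$2" type monitor || return 1
    ifconfig "$2" up || return 1
    iw dev "$2" del || return 1
}

# Repeat the cycle up to $3 times and report the first failing iteration.
repro_loop() {
    phy=$1
    ifname=$2
    max=$3
    i=0
    while [ "$i" -lt "$max" ]; do
        i=$((i + 1))
        if ! repro_once "$phy" "$ifname"; then
            echo "failed at iteration $i"
            return 1
        fi
    done
    echo "no failure in $max iterations"
}
```

Calling it as "repro_loop phy0 mon0 1000" would then run at most 1000 cycles and leave the system in the failed state for inspection (dmesg, logread) once something goes wrong.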

Note that the 4.1 backport is known to be incorrect; it was
fixed later by:

https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/net/netlink/af_netlink.c?h=linux-4.1.y&id=a52ec6de6d1638e8c203d7188c55627f75371612


> This regression is caused by commit:
> 4e27762 netlink: Fix autobind race condition that leads to zero port ID
> https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/?id=4e27762417669cb459971635be550eb7b5598286
> that is a backport of upstream:
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=1f770c0a09da855a2b51af6d19de97fb955eca85
>
> This still happens with kernel 4.4.


Looks like the goto is missing in the 4.4 branch too. ;) Would you mind
sending a patch to GregKH?


>
> My hardware is Linksys WRT160NL (Atheros AR9130 SoC) and I'm using two
> __ap interfaces on phy0 (wlan0 and wlan0-1).
>
> Could you take a look at this?
> Is there some additional info I could provide to help fixing this?
>
> --
> Rafał