Re: [syzbot] kernel panic: kernel stack overflow
From: Taehee Yoo
Date: Fri Oct 21 2022 - 07:08:28 EST
Hi,
2022. 10. 14. 오전 12:00에 Taehee Yoo 이(가) 쓴 글:
> Hi,
>
> On 10/12/22 21:19, Eric Dumazet wrote:
> > On Wed, Oct 12, 2022 at 12:53 AM Dmitry Vyukov <dvyukov@xxxxxxxxxx>
> wrote:
> >>
> >> On Wed, 12 Oct 2022 at 09:48, syzbot
> >> <syzbot+60748c96cf5c6df8e581@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
> >>>
> >>> Hello,
> >>>
> >>> syzbot found the following issue on:
> >>>
> >>> HEAD commit: bbed346d5a96 Merge branch 'for-next/core' into
> for-kernelci
> >>> git tree:
> git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git
for-kernelci
> >>> console output:
> https://syzkaller.appspot.com/x/log.txt?x=14a03a2a880000
> >>> kernel config:
> https://syzkaller.appspot.com/x/.config?x=aae2d21e7dd80684
> >>> dashboard link:
> https://syzkaller.appspot.com/bug?extid=60748c96cf5c6df8e581
> >>> compiler: Debian clang version
> 13.0.1-++20220126092033+75e33f71c2da-1~exp1~20220126212112.63, GNU ld
> (GNU Binutils for Debian) 2.35.2
> >>> userspace arch: arm64
> >>>
> >>> Unfortunately, I don't have any reproducer for this issue yet.
> >>>
> >>> Downloadable assets:
> >>> disk image:
>
https://storage.googleapis.com/syzbot-assets/11078f50b80b/disk-bbed346d.raw.xz
>
> >>> vmlinux:
>
https://storage.googleapis.com/syzbot-assets/398e5f1e6c84/vmlinux-bbed346d.xz
>
> >>>
> >>> IMPORTANT: if you fix the issue, please add the following tag to
> the commit:
> >>> Reported-by: syzbot+60748c96cf5c6df8e581@xxxxxxxxxxxxxxxxxxxxxxxxx
> >>
> >> +Jiri
> >>
> >> It looks like the issue is with the team device. It seems to call
> >> itself infinitely.
> >> team_device_event was mentioned in stack overflow bugs in the past:
> >>
>
https://groups.google.com/g/syzkaller-bugs/search?q=%22team_device_event%22
> >>
> >
> >
> > Taehee Yoo, can you take a look ?
> >
> > Patch series of yours was supposed to limit max nest level to 8
> >
> >
>
https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/commit/?id=65921376425fc9c8b7ce647e1f7989f7cdf5dd70
>
> >
>
> I found a reproducer.
>
> #test_team.sh
> ip link add dummy0 type dummy
> ip link set dummy0 up
> for a1 in {0..1}
> do
> ip link add team$a1 type team
> for a2 in {0..1}
> do
> ip link add team$a1$a2 master team$a1 type team
> for a3 in {0..1}
> do
> ip link add team$a1$a2$a3 master team$a1$a2
> type team
> for a4 in {0..1}
> do
> ip link add team$a1$a2$a3$a4 master
> team$a1$a2$a3 type team
> for a5 in {0..1}
> do
> ip link add team$a1$a2$a3$a4$a5
> master team$a1$a2$a3$a4 type team
> for a6 in {0..1}
> do
> ip link add
> team$a1$a2$a3$a4$a5$a6 master team$a1$a2$a3$a4$a5 type team
> ip link add
> macvlan$a1$a2$a3$a4$a5$a6 link dummy0 master team$a1$a2$a3$a4$a5$a6 type
> macvlan
> ip link set
> macvlan$a1$a2$a3$a4$a5$a6 up
> ip link set
> team$a1$a2$a3$a4$a5$a6 up
> done
> ip link set
team$a1$a2$a3$a4$a5 up
> done
> ip link set team$a1$a2$a3$a4 up
> done
> ip link set team$a1$a2$a3 up
> done
> ip link set team$a1$a2 up
> done
> ip link set team$a1 up
> done
>
> #test_ethtool.sh
> for a1 in {0..1}
> do
> ethtool -K team$a1 lro $1
> for a2 in {0..1}
> do
> ethtool -K team$a1$a2 lro $1
> for a3 in {0..1}
> do
> ethtool -K team$a1$a2$a3 lro $1
> for a4 in {0..1}
> do
> ethtool -K team$a1$a2$a3$a4 lro $1
> for a5 in {0..1}
> do
> ethtool -K team$a1$a2$a3$a4$a5
> lro $1
> for a6 in {0..1}
> do
> ethtool -K
> team$a1$a2$a3$a4$a5$a6 lro $1
> ethtool -K
> macvlan$a1$a2$a3$a4$a5$a6 lro $1
> done
> done
> done
> done
> done
> done
>
> shell#1
> bash test_team.sh
> while :
> do
> bash test_ethtool.sh on
> done
> shell#2
> while :
> do
> bash test_ethtool.sh off
> done
>
> We can see a very similar call trace with the above reproducer.
> I think it is the same issue.
> Could you please test it?
>
> And, I found the fixed same issue too.
>
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.0&id=dd912306ff008891c82cd9f63e8181e47a9cb2fb
>
> https://groups.google.com/g/syzkaller-bugs/c/-5OV1OW-dS4/m/o2Oq6AYSAwAJ
>
I found the root cause of this issue.
This is simpler reproducer.
ip link add team0 type team
ethtool -K team0 lro on
for i in {1..100}
do
ip link add team$i master team0 type team
ethtool -K team$i lro on
done
ethtool -K team0 lro off
The above graph is like below:
team0
|
+------+------+-----+-----+
| | | | |
team1 team2 team3 ... team100
int __netdev_update_features(struct net_device *dev)
{
struct net_device *upper, *lower;
netdev_features_t features;
struct list_head *iter;
int err = -1;
...
sync_lower:
/* some features must be disabled on lower devices when disabled
* on an upper device (think: bonding master or bridge)
*/
netdev_for_each_lower_dev(dev, lower, iter)
netdev_sync_lower_features(dev, lower, features);
...
static void netdev_sync_lower_features(struct net_device *upper,
struct net_device *lower, netdev_features_t features)
{
netdev_features_t upper_disables = NETIF_F_UPPER_DISABLES;
netdev_features_t feature;
int feature_bit;
for_each_netdev_feature(upper_disables, feature_bit) {
feature = __NETIF_F_BIT(feature_bit);
if (!(features & feature) && (lower->features & feature)) {
netdev_dbg(upper, "Disabling feature %pNF on
lower dev %s.\n",
&feature, lower->name);
lower->wanted_features &= ~feature;
__netdev_update_features(lower);
if (unlikely(lower->features & feature))
netdev_WARN(upper, "failed to disable
%pNF on %s!\n",
&feature, lower->name);
else
netdev_features_change(lower);<-----HERE
}
}
}
void netdev_features_change(struct net_device *dev)
{
call_netdevice_notifiers(NETDEV_FEAT_CHANGE, dev);
}
The code looks like an iterator.
But it would work recursively because of notification.
When team0's feature(LRO) is changed with <ethtool -K team0 lro off>",
__netdev_update_features(team0) is called.
__netdev_update_features(team0) internally sends NETDEV_FEAT_CHANGE
event to all lower interfaces(team1, team2, ... team100).
team1 will receive NETDEV_FEAT_CHANGE, and it sends NETDEV_FEAT_CHANGE
to the upper interface(team0).
team0 will receive NETDEV_FEAT_CHANGE again, and it sends
NETDEV_FEAT_CHANGE to the all lower interfaces(team1, team2, ... team100).
(At this point, team1 flag was already set, so it will be skipped.)
team2 will receive NETDEV_FEAT_CHANGE, and it sends NETDEV_FEAT_CHANGE
to the upper interface(team0).
team0 will receive NETDEV_FEAT_CHANGE again again, and it sends
NETDEV_FEAT_CHANGE to the all lower interfaces(team1, team2, ... team100).
(team1, team2 skipped.)
...
So, if there are a few lower interfaces(roughly under 30 lower
interfaces), it anyway works even if internally works recursively.
But so many lower interfaces exist, stack overflow will occur.
This is the root cause of this issue.
I think synchronization direction should be one way.
Up or Down.
It means that if the team0 interface can send the NETDEV_FEAT_CHANGE
notification event to the lower interface,
the lower interfaces should be disallowed to send NETDEV_FEAT_CHANGE
event to the upper interface.
bonding has same issue.