Re: [virtio] 3618ad2a7c: kernel_BUG_at_drivers/net/virtio_net.c
From: Michael S. Tsirkin
Date: Sun Oct 18 2020 - 06:26:13 EST
On Sun, Oct 18, 2020 at 04:25:14PM +0800, kernel test robot wrote:
> Greeting,
>
> FYI, we noticed the following commit (built with gcc-9):
>
> commit: 3618ad2a7c0e78e4258386394d5d5f92a3dbccf8 ("virtio-net: ethtool configurable RXCSUM")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>
>
> in testcase: trinity
> version: trinity-i386
> with following parameters:
>
> runtime: 300s
>
> test-description: Trinity is a linux system call fuzz tester.
> test-url: http://codemonkey.org.uk/projects/trinity/
>
>
> on test machine: qemu-system-i386 -enable-kvm -cpu SandyBridge -smp 2 -m 8G
>
> caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
>
>
> +----------------------------------------+------------+------------+
> | | c9bf52a173 | 3618ad2a7c |
> +----------------------------------------+------------+------------+
> | boot_successes | 18 | 0 |
> | boot_failures | 0 | 20 |
> | kernel_BUG_at_drivers/net/virtio_net.c | 0 | 20 |
> | invalid_opcode:#[##] | 0 | 20 |
> | EIP:virtnet_send_command | 0 | 20 |
> | Kernel_panic-not_syncing | 0 | 20 |
> +----------------------------------------+------------+------------+
>
>
> If you fix the issue, kindly add following tag
> Reported-by: kernel test robot <lkp@xxxxxxxxx>
>
>
> [ 72.229171] kernel BUG at drivers/net/virtio_net.c:1667!
> [ 72.230266] invalid opcode: 0000 [#1] PREEMPT SMP
> [ 72.231172] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.9.0-rc8-02934-g3618ad2a7c0e7 #1
> [ 72.231172] EIP: virtnet_send_command+0x120/0x140
> [ 72.231172] Code: 00 0f 94 c0 8b 7d f0 65 33 3d 14 00 00 00 75 1c 8d 65 f4 5b 5e 5f 5d c3 66 90 be 01 00 00 00 e9 6e ff ff ff 8d b6 00 00 00 00 <0f> 0b e8 d9 bb 82 00 eb 17 8d b4 26 00 00 00 00 8d b4 26 00 00 00
> [ 72.231172] EAX: 0000000d EBX: f72895c0 ECX: 00000017 EDX: 00000011
> [ 72.231172] ESI: f7197800 EDI: ed69bd00 EBP: ed69bcf4 ESP: ed69bc98
> [ 72.231172] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010246
> [ 72.231172] CR0: 80050033 CR2: 00000000 CR3: 02c84000 CR4: 000406f0
> [ 72.231172] Call Trace:
> [ 72.231172] ? __virt_addr_valid+0x45/0x60
> [ 72.231172] ? ___cache_free+0x51f/0x760
> [ 72.231172] ? kobject_uevent_env+0xf4/0x560
> [ 72.231172] virtnet_set_guest_offloads+0x4d/0x80
> [ 72.231172] virtnet_set_features+0x85/0x120
> [ 72.231172] ? virtnet_set_guest_offloads+0x80/0x80
> [ 72.231172] __netdev_update_features+0x27a/0x8e0
> [ 72.231172] ? kobject_uevent+0xa/0x20
> [ 72.231172] ? netdev_register_kobject+0x12c/0x160
> [ 72.231172] register_netdevice+0x4fe/0x740
> [ 72.231172] register_netdev+0x1c/0x40
> [ 72.231172] virtnet_probe+0x728/0xb60
> [ 72.231172] ? _raw_spin_unlock+0x1d/0x40
> [ 72.231172] ? virtio_vdpa_get_status+0x1c/0x20
> [ 72.231172] virtio_dev_probe+0x1c6/0x271
> [ 72.231172] really_probe+0x195/0x2e0
> [ 72.231172] driver_probe_device+0x26/0x60
> [ 72.231172] device_driver_attach+0x49/0x60
> [ 72.231172] __driver_attach+0x46/0xc0
> [ 72.231172] ? device_driver_attach+0x60/0x60
> [ 72.231172] bus_for_each_dev+0x5d/0xa0
> [ 72.231172] driver_attach+0x19/0x20
> [ 72.231172] ? device_driver_attach+0x60/0x60
> [ 72.231172] bus_add_driver+0x197/0x1c0
> [ 72.231172] driver_register+0x66/0xc0
> [ 72.231172] register_virtio_driver+0x1b/0x40
> [ 72.231172] virtio_net_driver_init+0x61/0x86
> [ 72.231172] ? veth_init+0x14/0x14
> [ 72.231172] do_one_initcall+0x76/0x2e4
> [ 72.231172] ? rdinit_setup+0x2a/0x2a
> [ 72.231172] do_initcalls+0xb2/0xd5
> [ 72.231172] kernel_init_freeable+0x14f/0x179
> [ 72.231172] ? rest_init+0x100/0x100
> [ 72.231172] kernel_init+0xd/0xe0
> [ 72.231172] ret_from_fork+0x1c/0x30
> [ 72.231172] Modules linked in:
> [ 72.269563] ---[ end trace a6ebc4afea0e6cb1 ]---
>
>
> To reproduce:
>
> # build kernel
> cd linux
> cp config-5.9.0-rc8-02934-g3618ad2a7c0e7 .config
> make HOSTCC=gcc-9 CC=gcc-9 ARCH=i386 olddefconfig prepare modules_prepare bzImage
>
> git clone https://github.com/intel/lkp-tests.git
> cd lkp-tests
> bin/lkp qemu -k <bzImage> job-script # job-script is attached in this email
>
>
>
> Thanks,
> lkp
>
OK I more or less see what is going on.
virtnet_set_features now calls virtnet_set_guest_offloads
unconditionally; it used to call it only when there was something
to configure.
If the device does not have a control vq, virtnet_send_command hits
the BUG shown in the trace above.
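To illustrate the guard that went missing, here is a minimal user-space
sketch. It is not the driver code: struct model_vi, model_send_command and
model_set_guest_offloads are hypothetical stand-ins, and returning
-EOPNOTSUPP when there is no control vq is an assumption about what a fix
might do, not what the driver does today.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space model; not the real struct virtnet_info. */
struct model_vi {
	bool has_ctrl_vq;        /* models VIRTIO_NET_F_CTRL_VQ being negotiated */
	uint64_t guest_offloads; /* offloads currently programmed into the device */
	int commands_sent;       /* number of control commands issued so far */
};

/* Models virtnet_send_command(): must never run without a control vq. */
static void model_send_command(struct model_vi *vi, uint64_t offloads)
{
	assert(vi->has_ctrl_vq); /* stands in for the BUG in the report */
	vi->guest_offloads = offloads;
	vi->commands_sent++;
}

/* Only touch the control vq when the requested offloads actually change. */
static int model_set_guest_offloads(struct model_vi *vi, uint64_t offloads)
{
	if (offloads == vi->guest_offloads)
		return 0;           /* nothing to configure, no command */
	if (!vi->has_ctrl_vq)
		return -EOPNOTSUPP; /* assumed policy: cannot reconfigure */
	model_send_command(vi, offloads);
	return 0;
}
```

With this shape, registering a device without a control vq never reaches
the send path as long as the requested offloads match the current ones.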
Looking at this some more, I noticed that it does not really check
what the device supports. E.g.
	if ((dev->features ^ features) & NETIF_F_LRO) {
		if (features & NETIF_F_LRO)
			offloads |= GUEST_OFFLOAD_LRO_MASK &
				    vi->guest_offloads_capable;
		else
			offloads &= ~GUEST_OFFLOAD_LRO_MASK;
	}
and
#define GUEST_OFFLOAD_LRO_MASK ((1ULL << VIRTIO_NET_F_GUEST_TSO4) | \
				(1ULL << VIRTIO_NET_F_GUEST_TSO6) | \
				(1ULL << VIRTIO_NET_F_GUEST_ECN)  | \
				(1ULL << VIRTIO_NET_F_GUEST_UFO))
But there's no guarantee that e.g. VIRTIO_NET_F_GUEST_TSO6 is set.
If it isn't, the command should not include that bit.
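A rough sketch of what "only send what the device has" could look like,
as a standalone function rather than driver code. The helper name
lro_offloads and its negotiated parameter are made up for illustration;
the bit numbers are the ones the virtio-net spec assigns to these features.

```c
#include <stdbool.h>
#include <stdint.h>

/* Feature bit numbers as defined by the virtio-net specification. */
#define VIRTIO_NET_F_GUEST_TSO4	7
#define VIRTIO_NET_F_GUEST_TSO6	8
#define VIRTIO_NET_F_GUEST_ECN	9
#define VIRTIO_NET_F_GUEST_UFO	10

#define GUEST_OFFLOAD_LRO_MASK ((1ULL << VIRTIO_NET_F_GUEST_TSO4) | \
				(1ULL << VIRTIO_NET_F_GUEST_TSO6) | \
				(1ULL << VIRTIO_NET_F_GUEST_ECN)  | \
				(1ULL << VIRTIO_NET_F_GUEST_UFO))

/*
 * Hypothetical helper: compute the offload set after an LRO toggle,
 * limited to the bits the device actually negotiated.
 */
static uint64_t lro_offloads(uint64_t offloads, uint64_t negotiated, bool enable)
{
	if (enable)
		offloads |= GUEST_OFFLOAD_LRO_MASK & negotiated;
	else
		offloads &= ~GUEST_OFFLOAD_LRO_MASK;

	/* Never pass on a bit the device did not offer, even a stale one. */
	return offloads & negotiated;
}
```

The final mask is the design point: whatever ended up in offloads earlier,
a bit such as VIRTIO_NET_F_GUEST_TSO6 is dropped unless it was negotiated.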
And also
static int virtnet_set_features(struct net_device *dev,
				netdev_features_t features)
{
	struct virtnet_info *vi = netdev_priv(dev);
	u64 offloads = vi->guest_offloads;
seems wrong since guest_offloads is zero-initialized:
it does not reflect the state after reset, which comes from
the features.
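For the initial state, something along these lines is what I have in mind,
written as a standalone sketch. The helper initial_guest_offloads and the
guest_offload_bits table are hypothetical names; the assumption is that the
state after reset is simply the guest offload bits present in the
negotiated features.

```c
#include <stddef.h>
#include <stdint.h>

/* Guest offload feature bits (numbers from the virtio-net specification). */
static const unsigned int guest_offload_bits[] = {
	1,	/* VIRTIO_NET_F_GUEST_CSUM */
	7,	/* VIRTIO_NET_F_GUEST_TSO4 */
	8,	/* VIRTIO_NET_F_GUEST_TSO6 */
	9,	/* VIRTIO_NET_F_GUEST_ECN */
	10,	/* VIRTIO_NET_F_GUEST_UFO */
};

/*
 * Hypothetical helper: offload state right after reset, derived from the
 * negotiated features instead of starting from zero.
 */
static uint64_t initial_guest_offloads(uint64_t negotiated_features)
{
	uint64_t offloads = 0;
	size_t i;

	for (i = 0; i < sizeof(guest_offload_bits) / sizeof(guest_offload_bits[0]); i++) {
		uint64_t bit = 1ULL << guest_offload_bits[i];

		/* Keep only guest offload bits; ignore unrelated features. */
		if (negotiated_features & bit)
			offloads |= bit;
	}
	return offloads;
}
```

Seeding guest_offloads this way before the first set_features call would
let the "did anything change" comparison start from the real device state.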
I suggest we revert 3618ad2a7c0e78 for now.
Let's work on something more robust wrt possible hardware features.
--
MST