Re: Hang in 9p/virtio

From: Cornelia Huck
Date: Tue Aug 02 2016 - 05:18:50 EST


On Sat, 30 Jul 2016 23:42:18 +0200
Vegard Nossum <vegard.nossum@xxxxxxxxxx> wrote:

> Hi,
>
> With fault injection triggering an allocation failure for the
> alloc_indirect() call in virtqueue_add() I'm seeing a hang in
> p9_virtio_zc_request() -- it seems to be waiting here indefinitely
> (i.e. at least 120 seconds):
>
> err = wait_event_interruptible(*req->wq,
> req->status >= REQ_STATUS_RCVD);
>
> Maybe somebody who is already familiar with the code would have a look?
>
> Stack trace for the memory allocation failure:
>
> CPU: 2 PID: 3877 Comm: trinity-c2 Not tainted 4.7.0+ #70
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> Ubuntu-1.8.2-1ubuntu1 04/01/2014
> ffffffff84354a78 ffff88010594f2e8 ffffffff81d72f91 ffffffff84354a60
> 1ffff10020b29e62 ffff88010594f398 ffffffff81e07df7 00007faad2003fff
> 0000000000000064 ffffffffffffffff 0000000041b58ab3 ffffffff840a481c
> Call Trace:
> [...]
> [<ffffffff81473886>] __kmalloc+0x66/0x2e0
> [<ffffffff81f7c6b4>] alloc_indirect.isra.8+0x24/0xa0
> [<ffffffff81f7d37f>] virtqueue_add_sgs+0x41f/0xc90
> [<ffffffff836eb281>] p9_virtio_zc_request+0x531/0xdb0
> [<ffffffff836d6ecf>] p9_client_zc_rpc.constprop.14+0x23f/0xe80
> [<ffffffff836db77c>] p9_client_read+0x4bc/0x8d0
> [<ffffffff8193f0a3>] v9fs_file_read_iter+0xd3/0x190
> [<ffffffff814b4b62>] do_iter_readv_writev+0x212/0x490
> [<ffffffff814b6be9>] do_readv_writev+0x359/0x660
> [<ffffffff814babc7>] vfs_readv+0x67/0xa0
> [<ffffffff814bacd8>] do_readv+0xd8/0x270
>
> Stack trace for the stuck call:
>
> NMI backtrace for cpu 2
> CPU: 2 PID: 3877 Comm: trinity-c2 Not tainted 4.7.0+ #70
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> Ubuntu-1.8.2-1ubuntu1 04/01/2014
> task: ffff8801174f5b00 task.stack: ffff880105948000
> RIP: 0010:[<ffffffff810b02a0>] [<ffffffff810b02a0>]
> __default_send_IPI_dest_field+0xe0/0x130
> Call Trace:
> [...]
> [<ffffffff811d584e>] prepare_to_wait_event+0x19e/0x410
> [<ffffffff836eb790>] p9_virtio_zc_request+0xa40/0xdb0
> [<ffffffff836d6ecf>] p9_client_zc_rpc.constprop.14+0x23f/0xe80
> [<ffffffff836db77c>] p9_client_read+0x4bc/0x8d0
> [<ffffffff8193f0a3>] v9fs_file_read_iter+0xd3/0x190
> [<ffffffff814b4b62>] do_iter_readv_writev+0x212/0x490
> [<ffffffff814b6be9>] do_readv_writev+0x359/0x660
> [<ffffffff814babc7>] vfs_readv+0x67/0xa0
> [<ffffffff814bacd8>] do_readv+0xd8/0x270

What happens is that the code falls back to direct virtio addressing
(after indirect addressing failed) - and this should work.
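
To make the fallback concrete, here is a minimal userspace sketch of the
decision as I understand it. It is only loosely modelled on virtqueue_add()
and is not the kernel code: every toy_* name, the struct layout and the
fail_alloc flag (standing in for fault injection) are made up for
illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified model of a virtqueue; not the kernel's struct. */
struct toy_vq {
	unsigned int num_free;    /* free descriptors left in the ring */
	bool indirect_supported;  /* indirect descriptors were negotiated */
	bool fail_alloc;          /* simulate an injected allocation failure */
};

/* Hypothetical descriptor entry, used only to size the indirect table. */
struct toy_desc {
	unsigned long addr;
	unsigned int len;
};

enum toy_result { TOY_INDIRECT, TOY_DIRECT, TOY_NOSPC };

/* Stand-in for alloc_indirect(): returns NULL when the failure is injected. */
static struct toy_desc *toy_alloc_indirect(const struct toy_vq *vq,
					   unsigned int total_sg)
{
	if (vq->fail_alloc)
		return NULL;
	return calloc(total_sg, sizeof(struct toy_desc));
}

/*
 * Try an indirect table first. If that allocation fails, fall back to
 * placing every scatter-gather entry directly in the ring, which needs
 * total_sg free descriptors instead of just one.
 */
static enum toy_result toy_virtqueue_add(struct toy_vq *vq,
					 unsigned int total_sg)
{
	struct toy_desc *table = NULL;
	unsigned int needed;

	if (vq->indirect_supported && total_sg > 1 && vq->num_free > 0)
		table = toy_alloc_indirect(vq, total_sg);

	needed = table ? 1 : total_sg;
	if (vq->num_free < needed) {
		free(table);
		return TOY_NOSPC;  /* caller has to wait for space and retry */
	}

	vq->num_free -= needed;
	if (table) {
		free(table);       /* the toy model does not keep the table */
		return TOY_INDIRECT;
	}
	return TOY_DIRECT;
}
```

With num_free = 8 and total_sg = 4, the sketch consumes one descriptor when
the allocation succeeds and four when fail_alloc is set. The request is
still queued in both cases, which is why the allocation failure on its own
should not lead to a hang.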

I'm more inclined to suspect a qemu bug than a kernel bug, as your
qemu version is quite old and there have been fixes in the virtio
buffer handling and virtio-9p in the meantime. (The fix I have in mind
is "virtio-9p: fix any_layout".)

Could you retry with a more recent qemu (at least version 2.4)?