Re: Random guest crashes since 5c34d002dcc7 ("virtio_pci: use shared interrupts for virtqueues")

From: Mike Galbraith
Date: Tue Apr 11 2017 - 00:20:43 EST


On Tue, 2017-04-11 at 00:23 +0300, Michael S. Tsirkin wrote:
> On Sat, Apr 08, 2017 at 07:01:34AM +0200, Mike Galbraith wrote:
> > On Fri, 2017-04-07 at 21:56 +0300, Michael S. Tsirkin wrote:
> >
> > > OK. test3 and test4 are now pushed: test3 should fix your hang,
> > > test4 is trying to fix a crash reported independently.
> >
> > test3 does not fix the post hibernate hang business that I can easily
> > reproduce, those are NFS, and at least as old as 4.4. Host/guest,
> > dunno, put 4.4 on both, guest hangs intermittently.
>
> OK so IIUC you agree it's a good idea to send test4 to Linus, right?

Well, my box agrees that that is a viable option.

> Hybernation's still broken but that's not a regression.

Yup.

> > [] __rpc_wait_for_completion_task+0x30/0x30 [sunrpc]
> > [] rpc_wait_bit_killable+0x1e/0xb0 [sunrpc]
> > [] __rpc_wait_for_completion_task+0x30/0x30 [sunrpc]
> > [] autoremove_wake_function+0x50/0x50
> > [] call_decode+0x850/0x850 [sunrpc]
> > [] call_decode+0x850/0x850 [sunrpc]
> > [] __rpc_execute+0x14e/0x440 [sunrpc]
> > [] ktime_get+0x35/0xa0
> > [] rpc_run_task+0x120/0x170 [sunrpc]
> > [] nfs4_call_sync_sequence+0x56/0x80 [nfsv4]
> > [] _nfs4_proc_getattr+0xb0/0xc0 [nfsv4]
> > [] path_lookupat+0xd2/0x100
> > [] nfs4_proc_getattr+0x5c/0xe0 [nfsv4]
> > [] __nfs_revalidate_inode+0xa0/0x300 [nfs]
> > [] nfs_getattr+0x95/0x250 [nfs]
> > [] vfs_statx+0x7b/0xc0
> > [] SYSC_newstat+0x20/0x40
> > [] entry_SYSCALL_64_fastpath+0x1a/0xa9
> > [] 0xffffffffffffffff
> >
> > I noted no _other_ misbehavior in either kernel, w/wo threadirqs.
> >
> > > > -Mike
>
> Interesting. I would guess virtio net does not complete some
> packets. So you were unable to find an old guest where this
> works fine?

I just tried my opensuse 13.2 clone. It works markedly less fine,
turns into a brick either on the way down or back up in short order.

-Mike