Re: [PATCH v1 0/2] Virtio: fix some vq allocation issues

From: Halil Pasic
Date: Sun Dec 30 2018 - 01:10:27 EST

On Sat, 29 Dec 2018 02:45:49 +0000
"Wang, Wei W" <> wrote:

> On Friday, December 28, 2018 3:57 PM, Christian Borntraeger wrote:
> > On 28.12.2018 03:26, Wei Wang wrote:
> > > Some vqs don't need to be allocated when the related feature bits are
> > > disabled. Callers notify the vq allocation layer by setting the
> > > related names[i] to NULL.
> > >
> > > This patch series fixes the find_vqs implementations to handle this case.
> >
> > So the random crashes during boot are gone.
> > What still does not work is actually using the balloon.
> >
> > So in the qemu monitor using, let's say, "balloon 1000" will hang the guest.
> > Seems to be a deadlock in the virtio-ccw code. We seem to call the config
> > code in the interrupt handler.
> Yes. It reads a config register from the interrupt handler. Do you know why ccw doesn't support it and has some internal lock that caused the deadlock issue?
> Best,
> Wei

I guess you are the first one trying to read virtio config from within
interrupt context. AFAICT this never worked.

Here is what happens. The API documentation of ccw_device_start() says it
must be called with the ccw device lock held, so ccw_io_helper() takes
that lock (and has done so since forever, I guess). OTOH
do_cio_interrupt() takes the subchannel lock, and
io_subchannel_initialize_dev() makes the ccw device lock be the
subchannel lock. That means when one tries to get virtio config from
within a cio interrupt context we deadlock, because we try to take a
lock we already hold.

That said, I don't think this limitation is by design (i.e. intended).
Maybe Connie can help us with that question. AFAIK we have nothing
documented regarding this (neither that one can nor that one can't).

Obviously, there are multiple ways around this problem, and at the
moment I can't tell which would be my preferred one.