Re: Random guest crashes since 5c34d002dcc7 ("virtio_pci: use shared interrupts for virtqueues")

From: Michael S. Tsirkin
Date: Wed Mar 29 2017 - 16:19:48 EST


On Wed, Mar 29, 2017 at 08:23:22AM +0200, Mike Galbraith wrote:
> On Mon, 2017-03-27 at 20:18 +0200, Mike Galbraith wrote:
>
> > BTW, WRT RT woes with $subject, I tried booting a generic kernel with
> > threadirqs, and bingo, same deal, just a bit more painful than for RT,
> > where there's no watchdog moaning accompanying the (preemptible) spin.
>
> BTW++: the last hunk of this bandaid may be a bug fix. With only the
> first two, box tried to use uninitialized stuff on hibernate, went
> boom. Looks like that may be possible without help from me.
>
> --- a/drivers/char/virtio_console.c
> +++ b/drivers/char/virtio_console.c
> @@ -2058,7 +2058,7 @@ static int virtcons_probe(struct virtio_
> portdev->max_nr_ports = 1;
>
> /* Don't test MULTIPORT at all if we're rproc: not a valid feature! */
> - if (!is_rproc_serial(vdev) &&
> + if (!is_rproc_serial(vdev) && !IS_ENABLED(CONFIG_IRQ_FORCED_THREADING) &&
> virtio_cread_feature(vdev, VIRTIO_CONSOLE_F_MULTIPORT,
> struct virtio_console_config, max_nr_ports,
> &portdev->max_nr_ports) == 0) {
> @@ -2179,7 +2179,9 @@ static struct virtio_device_id id_table[
>
> static unsigned int features[] = {
> VIRTIO_CONSOLE_F_SIZE,
> +#ifndef CONFIG_IRQ_FORCED_THREADING
> VIRTIO_CONSOLE_F_MULTIPORT,
> +#endif
> };

These look kind of questionable.
Is this part needed?

> static struct virtio_device_id rproc_serial_id_table[] = {
> @@ -2202,14 +2204,16 @@ static int virtcons_freeze(struct virtio
>
> vdev->config->reset(vdev);
>
> - virtqueue_disable_cb(portdev->c_ivq);
> + if (use_multiport(portdev))
> + virtqueue_disable_cb(portdev->c_ivq);
> cancel_work_sync(&portdev->control_work);
> cancel_work_sync(&portdev->config_work);
> /*
> * Once more: if control_work_handler() was running, it would
> * enable the cb as the last step.
> */
> - virtqueue_disable_cb(portdev->c_ivq);
> + if (use_multiport(portdev))
> + virtqueue_disable_cb(portdev->c_ivq);
> remove_controlq_data(portdev);
>
> list_for_each_entry(port, &portdev->ports, list) {

This looks real. No idea why interrupt sharing would
trigger anything like this, but go figure.
Can you please submit this separately, with
a signature?
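
For illustration, here is a minimal userspace sketch of what the guard in the
last hunk protects against, assuming the control virtqueue (c_ivq) is only set
up when MULTIPORT is negotiated. The struct layouts and the freeze_sketch()
helper are hypothetical stand-ins, not the real kernel definitions; only the
names use_multiport(), virtqueue_disable_cb() and c_ivq come from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct virtqueue. */
struct virtqueue {
	bool cb_enabled;
};

/* Hypothetical stand-in for the driver's struct ports_device. */
struct ports_device {
	bool multiport;          /* models a negotiated VIRTIO_CONSOLE_F_MULTIPORT */
	struct virtqueue *c_ivq; /* assumed valid only when multiport is true */
};

static bool use_multiport(const struct ports_device *portdev)
{
	return portdev->multiport;
}

static void virtqueue_disable_cb(struct virtqueue *vq)
{
	/* The unguarded freeze path would reach here with an unset pointer. */
	assert(vq != NULL);
	vq->cb_enabled = false;
}

/*
 * Mirrors the shape of the patched freeze path: the callback is disabled
 * twice (before and after the work items would be cancelled), but only
 * when the control queue exists. Returns the number of disable calls made.
 */
static int freeze_sketch(struct ports_device *portdev)
{
	int calls = 0;

	if (use_multiport(portdev)) {
		virtqueue_disable_cb(portdev->c_ivq);
		calls++;
	}

	/* The real driver cancels control_work and config_work here. */

	if (use_multiport(portdev)) {
		virtqueue_disable_cb(portdev->c_ivq);
		calls++;
	}

	return calls;
}
```

With multiport set and a valid c_ivq, freeze_sketch() disables the callback
twice; with multiport clear it never touches c_ivq, which is the behaviour the
hunk adds.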

--
MST