Re: vhost changes (batched) in linux-next after 12/13 trigger random crashes in KVM guests after reboot

From: Christian Borntraeger
Date: Thu Feb 13 2020 - 04:30:23 EST




On 12.02.20 17:34, Eugenio PÃrez wrote:
> On Tue, 2020-02-11 at 14:13 +0100, Christian Borntraeger wrote:
>>
>> On 11.02.20 14:04, Eugenio PÃrez wrote:
>>> On Mon, 2020-02-10 at 12:01 +0100, Christian Borntraeger wrote:
>>>> On 10.02.20 10:47, Eugenio Perez Martin wrote:
>>>>> Hi Christian.
>>>>>
>>>>> I'm not able to reproduce the failure with eccb852f1fe6bede630e2e4f1a121a81e34354ab commit. Could you add more
>>>>> data?
>>>>> Your configuration (libvirt or qemu line), and host's dmesg output if any?
>>>>>
>>>>> Thanks!
>>>>
>>>> If it was not obvious, this is on s390x, a big endian system.
>>>>
>>>
>>> Hi Christian. Thank you very much for your fast responses.
>>>
>>> Could you try this patch on top of eccb852f1fe6bede630e2e4f1a121a81e34354ab?
>>
>> I still get
>> [ 43.665145] Guest moved used index from 0 to 289
>> after some reboots.
>>
>>
>>> Thanks!
>>>
>>> From 71d0f9108a18aa894cc0c0c1c7efbad39f465a27 Mon Sep 17 00:00:00 2001
>>> From: =?UTF-8?q?Eugenio=20P=C3=A9rez?= <
>>> eperezma@xxxxxxxxxx>
>>> Date: Tue, 11 Feb 2020 13:19:10 +0100
>>> Subject: [PATCH] vhost: fix return value of vhost_get_vq_desc
>>>
>>> Before of the batch change, it was the chain's head. Need to keep that
>>> way or we will not be able to free a chain of descriptors.
>>>
>>> Fixes: eccb852f1fe6 ("vhost: batching fetches")
>>> ---
>>> drivers/vhost/vhost.c | 3 +--
>>> 1 file changed, 1 insertion(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
>>> index b5a51b1f2e79..fc422c3e5c08 100644
>>> --- a/drivers/vhost/vhost.c
>>> +++ b/drivers/vhost/vhost.c
>>> @@ -2409,12 +2409,11 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq,
>>> *out_num += ret;
>>> }
>>>
>>> - ret = desc->id;
>>> -
>>> if (!(desc->flags & VRING_DESC_F_NEXT))
>>> break;
>>> }
>>>
>>> + ret = vq->descs[vq->first_desc].id;
>>> vq->first_desc = i + 1;
>>>
>>> return ret;
>>>
>
> Sorry, still not able to reproduce the issue.
>
> Could we try to disable all the vhost features?
>
> Thanks!
>
> diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
> index 661088ae6dc7..08f6d2ccb697 100644
> --- a/drivers/vhost/vhost.h
> +++ b/drivers/vhost/vhost.h
> @@ -250,11 +250,11 @@ int vhost_init_device_iotlb(struct vhost_dev *d, bool enabled);
> } while (0)
>
> enum {
> - VHOST_FEATURES = (1ULL << VIRTIO_F_NOTIFY_ON_EMPTY) |
> - (1ULL << VIRTIO_RING_F_INDIRECT_DESC) |
> - (1ULL << VIRTIO_RING_F_EVENT_IDX) |
> - (1ULL << VHOST_F_LOG_ALL) |
> - (1ULL << VIRTIO_F_ANY_LAYOUT) |
> + VHOST_FEATURES = /* (1ULL << VIRTIO_F_NOTIFY_ON_EMPTY) | */
> + /* (1ULL << VIRTIO_RING_F_INDIRECT_DESC) | */
> + /* (1ULL << VIRTIO_RING_F_EVENT_IDX) | */
> + /* (1ULL << VHOST_F_LOG_ALL) | */
> + /* (1ULL << VIRTIO_F_ANY_LAYOUT) | */
> (1ULL << VIRTIO_F_VERSION_1)
> };
>


I still get guest crashes with this on top of eccb852f1fe6. (The patch did not
apply, I had to manually comment out these things)