Re: [PATCH v14 6/6] virtio-balloon: Add support for providing unused page reports to host
From: David Hildenbrand
Date: Mon Dec 02 2019 - 05:44:25 EST
[...]
>> Sounds like the device is unused :D
>>
>> "Device info for reporting unused pages" ?
>>
>> I am in general wondering, should we rename "unused" to "free". I.e.,
>> "free page reporting" instead of "unused page reporting"? Or what was
>> the motivation behind using "unused" ?
>
> I honestly don't remember why I chose "unused" at this point. I can
> switch over to "free" if that is what is preferred.
>
> Looking over the code a bit more I suspect the reason for avoiding it
> is because free page hinting also mentioned reporting in a few spots.
Maybe we should fix these cases. FWIW, I'd prefer "free page reporting".
(e.g., pairs nicely with "free page hinting").
>>> +
>>> + virtqueue_kick(vq);
>>> +
>>> + /* When host has read buffer, this completes via balloon_ack */
>>> + wait_event(vb->acked, virtqueue_get_buf(vq, &unused));
>>
>> Is it safe to rely on the same ack-ing mechanism as the inflate/deflate
>> queue? What if both mechanisms are used concurrently and race/both wait
>> for the hypervisor?
>>
>> Maybe we need a separate vb->acked + callback function.
>
> So if I understand correctly what is actually happening is that the
> wait event is simply a trigger that will wake us up, and at that point
> we check to see if the buffer we submitted is done. If not we go back
> to sleep. As such all we are really waiting on is the notification
> that the buffers we submitted have been processed. So it is using the
> same function but on a different virtual queue.
Very right, this is just a waitqueue (was only looking at this patch,
not the full code). This should indeed be fine.
>>> vb->inflate_vq = vqs[VIRTIO_BALLOON_VQ_INFLATE];
>>> vb->deflate_vq = vqs[VIRTIO_BALLOON_VQ_DEFLATE];
>>> if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_STATS_VQ)) {
>>> @@ -932,12 +976,30 @@ static int virtballoon_probe(struct virtio_device *vdev)
>>> if (err)
>>> goto out_del_balloon_wq;
>>> }
>>> +
>>> + vb->pr_dev_info.report = virtballoon_unused_page_report;
>>> + if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_REPORTING)) {
>>> + unsigned int capacity;
>>> +
>>> + capacity = min_t(unsigned int,
>>> + virtqueue_get_vring_size(vb->reporting_vq),
>>> + VIRTIO_BALLOON_VRING_HINTS_MAX);
>>> + vb->pr_dev_info.capacity = capacity;
>>> +
>>> + err = page_reporting_register(&vb->pr_dev_info);
>>> + if (err)
>>> + goto out_unregister_shrinker;
>>> + }
>>
>> It can happen here that we start reporting before marking the device
>> ready. Can that be problematic?
>>
>> Maybe we have to ignore any reports in virtballoon_unused_page_report()
>> until ready...
>
> I don't think there is an issue with us putting buffers on the ring
> before it is ready. I think it will just cause our function to sleep.
>
> I'm guessing that is the case since init_vqs will add a buffer to the
> stats vq and that happens even earlier in virtballoon_probe.
>
Interesting: "Note: vqs are enabled automatically after probe returns.".
Learned something new.
The virtballoon_changed(vdev) *after* virtio_device_ready(vdev) made me
wonder, because that could also fill the queues.
Maybe Michael can clarify.
>>> +
>>> virtio_device_ready(vdev);
>>>
>>> if (towards_target(vb))
>>> virtballoon_changed(vdev);
>>> return 0;
>>>
>>> +out_unregister_shrinker:
>>> + if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM))
>>> + virtio_balloon_unregister_shrinker(vb);
>>
>> A sync is done implicitly, right? So after this call, we won't get any
>> new callbacks/are stuck in a callback.
>
> From what I can tell a read/write semaphore is used in
> unregister_shrinker when we delete it from the list so it shouldn't be
> an issue.
Yes, makes sense.
>
>>> out_del_balloon_wq:
>>> if (virtio_has_feature(vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT))
>>> destroy_workqueue(vb->balloon_wq);
>>> @@ -966,6 +1028,8 @@ static void virtballoon_remove(struct virtio_device *vdev)
>>> {
>>> struct virtio_balloon *vb = vdev->priv;
>>>
>>> + if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_REPORTING))
>>> + page_reporting_unregister(&vb->pr_dev_info);
>>
>> Dito, same question regarding syncs.
>
> Yes, although for that one I was using pointer deletion, a barrier,
> and a cancel_work_sync since I didn't support a list.
Okay, perfect.
[...]
>>
>> Small and powerful patch :)
>
> Agreed. Although we will have to see if we can keep it that way.
> Ideally I want to leave this with the ability so specify what size
> scatterlist we receive. However if we have to flip it around then it
> will force us to add logic for chopping up the scatterlist for
> processing in chunks.
I hope we can keep it like that. Otherwise each and every driver has to
implement this chopping-up (e.g., a hypervisor that can only send one
hint at a time - e.g., via a simple hypercall - would have to implement
that).
--
Thanks,
David / dhildenb