Re: Re: [PATCH v7 11/12] vduse: Introduce VDUSE - vDPA Device in Userspace
From: Yongji Xie
Date: Thu May 27 2021 - 23:55:02 EST
On Fri, May 28, 2021 at 9:33 AM Jason Wang <jasowang@xxxxxxxxxx> wrote:
>
>
> On 2021/5/27 6:14 PM, Yongji Xie wrote:
> > On Thu, May 27, 2021 at 4:43 PM Jason Wang <jasowang@xxxxxxxxxx> wrote:
> >>
> >> On 2021/5/27 4:41 PM, Jason Wang wrote:
> >>> On 2021/5/27 3:34 PM, Yongji Xie wrote:
> >>>> On Thu, May 27, 2021 at 1:40 PM Jason Wang <jasowang@xxxxxxxxxx> wrote:
> >>>>> On 2021/5/27 1:08 PM, Yongji Xie wrote:
> >>>>>> On Thu, May 27, 2021 at 1:00 PM Jason Wang <jasowang@xxxxxxxxxx>
> >>>>>> wrote:
> >>>>>>> On 2021/5/27 12:57 PM, Yongji Xie wrote:
> >>>>>>>> On Thu, May 27, 2021 at 12:13 PM Jason Wang <jasowang@xxxxxxxxxx>
> >>>>>>>> wrote:
> >>>>>>>>> On 2021/5/17 5:55 PM, Xie Yongji wrote:
> >>>>>>>>>> +
> >>>>>>>>>> +static int vduse_dev_msg_sync(struct vduse_dev *dev,
> >>>>>>>>>> +			      struct vduse_dev_msg *msg)
> >>>>>>>>>> +{
> >>>>>>>>>> +	init_waitqueue_head(&msg->waitq);
> >>>>>>>>>> +	spin_lock(&dev->msg_lock);
> >>>>>>>>>> +	vduse_enqueue_msg(&dev->send_list, msg);
> >>>>>>>>>> +	wake_up(&dev->waitq);
> >>>>>>>>>> +	spin_unlock(&dev->msg_lock);
> >>>>>>>>>> +	wait_event_killable(msg->waitq, msg->completed);
> >>>>>>>>> What happens if a (malicious) userspace never gives a
> >>>>>>>>> response?
> >>>>>>>>>
> >>>>>>>>> It looks like a DoS. If yes, we need to consider a way to fix that.
> >>>>>>>>>
> >>>>>>>> How about using wait_event_killable_timeout() instead?
> >>>>>>> Probably, and then we need to choose a suitable timeout and, more
> >>>>>>> importantly, need to report the failure to virtio.
> >>>>>>>
> >>>>>> Makes sense to me. But it looks like some
> >>>>>> vdpa_config_ops/virtio_config_ops such as set_status() don't have a
> >>>>>> return value. For now I've added a WARN_ON() for the failure. Do you
> >>>>>> mean we need to change the virtio core to handle the failure?
> >>>>> Maybe, but I'm not sure how hard that would be.
> >>>>>
> >>>> We need to change all virtio device drivers in this way.
> >>>
> >>> Probably.
> >>>
> >>>
> >>>>> We have NEEDS_RESET but it looks like we don't implement it.
> >>>>>
> >>>> Could it handle the failure of get_features() and get/set_config()?
> >>>
> >>> Looks not:
> >>>
> >>> "
> >>>
> >>> The device SHOULD set DEVICE_NEEDS_RESET when it enters an error state
> >>> that a reset is needed. If DRIVER_OK is set, after it sets
> >>> DEVICE_NEEDS_RESET, the device MUST send a device configuration change
> >>> notification to the driver.
> >>>
> >>> "
> >>>
> >>> This seems to imply that NEEDS_RESET may only work after the device is
> >>> probed. But in the current design, even reset() is not reliable.
> >>>
> >>>
> >>>>> Or a rough idea: maybe we need to relax things so that the device is
> >>>>> only loosely coupled with userspace. E.g. the device (control path)
> >>>>> is implemented in the kernel but the datapath is implemented in
> >>>>> userspace, like TUN/TAP.
> >>>>>
> >>>> I think it can work for most cases. One problem is that the set_config
> >>>> might change the behavior of the data path at runtime, e.g.
> >>>> virtnet_set_mac_address() in the virtio-net driver and
> >>>> cache_type_store() in the virtio-blk driver. Not sure if this path is
> >>>> able to return before the datapath is aware of this change.
> >>>
> >>> Good point.
> >>>
> >>> But set_config() should be rare:
> >>>
> >>> E.g. in the case of virtio-net with VERSION_1, the config space is
> >>> read-only, and it is set via the control vq.
> >>>
> >>> For block, we can
> >>>
> >>> 1) start without WCE, or
> >>> 2) add a config change notification to userspace, or
> >>> 3) extend the spec to use a vq instead of config space
> >>>
> >>> Thanks
> >>
> >> Another thing if we want to go this way:
> >>
> >> We need to find a way to terminate the data path from the kernel side,
> >> to implement the reset semantics.
> >>
> > Do you mean terminating the data path in vdpa_reset()?
>
>
> Yes.
>
>
> > Is it OK to just
> > notify userspace to stop the data path asynchronously?
>
>
> For well-behaved userspace, yes, but not for buggy or malicious ones.
>

But the buggy or malicious daemons can't do anything if my
understanding is correct.

> I had an idea: how about terminating the IOTLB in this case? Then we
> in fact turn the datapath off.
>

Sorry, I didn't get your point here. What do you mean by terminating
the IOTLB? Removing the IOTLB mappings? But userspace can still access
the mapped region.
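
Coming back to the timeout question from earlier in the thread, here is
a minimal userspace sketch (not the patch itself) of a bounded wait that
hands an error back to the caller instead of blocking forever. It uses
pthreads as a stand-in for the kernel waitqueue; struct demo_msg,
demo_msg_init(), demo_msg_complete() and demo_msg_sync() are
hypothetical names made up for illustration, and the choice of
-ETIMEDOUT as the error is an assumption. In the kernel the wait itself
would be wait_event_killable_timeout(), and the open question of how to
report the failure to virtio remains.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-in for a request waiting on a reply from the daemon. */
struct demo_msg {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool completed;
};

static void demo_msg_init(struct demo_msg *msg)
{
	pthread_mutex_init(&msg->lock, NULL);
	pthread_cond_init(&msg->cond, NULL);
	msg->completed = false;
}

/* What a well-behaved daemon would trigger when it replies. */
static void demo_msg_complete(struct demo_msg *msg)
{
	pthread_mutex_lock(&msg->lock);
	msg->completed = true;
	pthread_cond_signal(&msg->cond);
	pthread_mutex_unlock(&msg->lock);
}

/*
 * Wait at most timeout_ms for the reply.
 * Returns 0 if the message completed, -ETIMEDOUT otherwise, so the
 * caller can propagate the failure instead of hanging.
 */
static int demo_msg_sync(struct demo_msg *msg, long timeout_ms)
{
	struct timespec deadline;
	bool done;
	int ret = 0;

	assert(timeout_ms >= 0);

	/* Absolute deadline; condvars use CLOCK_REALTIME by default. */
	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += timeout_ms / 1000;
	deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	pthread_mutex_lock(&msg->lock);
	/* Loop to tolerate spurious wakeups; stop on timeout or error. */
	while (!msg->completed && ret == 0)
		ret = pthread_cond_timedwait(&msg->cond, &msg->lock, &deadline);
	done = msg->completed;
	pthread_mutex_unlock(&msg->lock);

	return done ? 0 : -ETIMEDOUT;
}
```

With no daemon replying, demo_msg_sync() returns -ETIMEDOUT once the
deadline passes; after demo_msg_complete() it returns 0 immediately.
Callers without a return value, such as set_status(), would still have
nowhere to send that error.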

Thanks,
Yongji