Re: [RFC 0/2] vduse: add support for networking devices
From: Yongji Xie
Date: Sun Apr 23 2023 - 04:22:47 EST
On Sun, Apr 23, 2023 at 2:31 PM Jason Wang <jasowang@xxxxxxxxxx> wrote:
>
> On Fri, Apr 21, 2023 at 10:28 PM Maxime Coquelin
> <maxime.coquelin@xxxxxxxxxx> wrote:
> >
> >
> >
> > On 4/21/23 07:51, Jason Wang wrote:
> > > On Thu, Apr 20, 2023 at 10:16 PM Maxime Coquelin
> > > <maxime.coquelin@xxxxxxxxxx> wrote:
> > >>
> > >>
> > >>
> > >> On 4/20/23 06:34, Jason Wang wrote:
> > >>> On Wed, Apr 19, 2023 at 9:43 PM Maxime Coquelin
> > >>> <maxime.coquelin@xxxxxxxxxx> wrote:
> > >>>>
> > >>>> This small series enables virtio-net device type in VDUSE.
> > >>>> With it, basic operation have been tested, both with
> > >>>> virtio-vdpa and vhost-vdpa using DPDK Vhost library series
> > >>>> adding VDUSE support [0] using split rings layout.
> > >>>>
> > >>>> Control queue support (and so multiqueue) has also been
> > >>>> tested, but require a Kernel series from Jason Wang
> > >>>> relaxing control queue polling [1] to function reliably.
> > >>>>
> > >>>> Other than that, we have identified a few gaps:
> > >>>>
> > >>>> 1. Reconnection:
> > >>>> a. VDUSE_VQ_GET_INFO ioctl() returns always 0 for avail
> > >>>> index, even after the virtqueue has already been
> > >>>> processed. Is that expected? I have tried instead to
> > >>>> get the driver's avail index directly from the avail
> > >>>> ring, but it does not seem reliable as I sometimes get
> > >>>> "id %u is not a head!\n" warnings. Also such solution
> > >>>> would not be possible with packed ring, as we need to
> > >>>> know the wrap counters values.
> > >>>
> > >>> Looking at the codes, it only returns the value that is set via
> > >>> set_vq_state(). I think it is expected to be called before the
> > >>> datapath runs.
> > >>>
> > >>> So when bound to virtio-vdpa, it is expected to return 0. But we need
> > >>> to fix the packed virtqueue case, I wonder if we need to call
> > >>> set_vq_state() explicitly in virtio-vdpa before starting the device.
> > >>>
> > >>> When bound to vhost-vdpa, Qemu will call VHOST_SET_VRING_BASE which
> > >>> will end up a call to set_vq_state(). Unfortunately, it doesn't
> > >>> support packed ring which needs some extension.
> > >>>
> > >>>>
> > >>>> b. Missing IOCTLs: it would be handy to have new IOCTLs to
> > >>>> query Virtio device status,
> > >>>
> > >>> What's the use case of this ioctl? It looks to me userspace is
> > >>> notified on each status change now:
> > >>>
> > >>> static int vduse_dev_set_status(struct vduse_dev *dev, u8 status)
> > >>> {
> > >>> struct vduse_dev_msg msg = { 0 };
> > >>>
> > >>> msg.req.type = VDUSE_SET_STATUS;
> > >>> msg.req.s.status = status;
> > >>>
> > >>> return vduse_dev_msg_sync(dev, &msg);
> > >>> }
> > >>
> > >> The idea was to be able to query the status at reconnect time, and
> > >> neither having to assume its value nor having to store its value in a
> > >> file (the status could change while the VDUSE application is stopped,
> > >> but maybe it would receive the notification at reconnect).
> > >
> > > I see.
> > >
> > >>
> > >> I will prototype using a tmpfs file to save needed information, and see
> > >> if it works.
> > >
> > > It might work but then the API is not self contained. Maybe it's
> > > better to have a dedicated ioctl.
> > >
> > >>
> > >>>> and retrieve the config
> > >>>> space set at VDUSE_CREATE_DEV time.
> > >>>
> > >>> In order to be safe, VDUSE avoids writable config space. Otherwise
> > >>> drivers could block on config writing forever. That's why we don't do
> > >>> it now.
> > >>
> > >> The idea was not to make the config space writable, but just to be able
> > >> to fetch what was filled at VDUSE_CREATE_DEV time.
> > >>
> > >> With the tmpfs file, we can avoid doing that and just save the config
> > >> space there.
> > >
> > > Same as the case for status.
> >
> > I have cooked a DPDK patch to support reconnect with a tmpfs file as
> > suggested by Yongji:
> >
> > https://gitlab.com/mcoquelin/dpdk-next-virtio/-/commit/53913f2b1155b02c44d5d3d298aafd357e7a8c48
>
> This seems tricky, for example for status:
>
> dev->log->status = dev->status;
>
> What if we crash here?
>
The message will be re-sent by the kernel if it's not replied. But I
think it would be better if we can restore it via some ioctl.
Thanks,
Yongji