RE: [PATCH v11 net-next 0/1] introduce Hyper-V VM Sockets(hv_sock)

From: Dexuan Cui
Date: Wed May 18 2016 - 20:59:20 EST


> From: devel [mailto:driverdev-devel-bounces@xxxxxxxxxxxxxxxxxxxxxx] On Behalf
> Of Dexuan Cui
> Sent: Tuesday, May 17, 2016 10:46
> To: David Miller <davem@xxxxxxxxxxxxx>
> Cc: olaf@xxxxxxxxx; gregkh@xxxxxxxxxxxxxxxxxxx; jasowang@xxxxxxxxxx;
> linux-kernel@xxxxxxxxxxxxxxx; joe@xxxxxxxxxxx; netdev@xxxxxxxxxxxxxxx;
> apw@xxxxxxxxxxxxx; devel@xxxxxxxxxxxxxxxxxxxxxx; Haiyang Zhang
> <haiyangz@xxxxxxxxxxxxx>
> Subject: RE: [PATCH v11 net-next 0/1] introduce Hyper-V VM Sockets(hv_sock)
>
> > From: David Miller [mailto:davem@xxxxxxxxxxxxx]
> > Sent: Monday, May 16, 2016 1:16
> > To: Dexuan Cui <decui@xxxxxxxxxxxxx>
> > Cc: gregkh@xxxxxxxxxxxxxxxxxxx; netdev@xxxxxxxxxxxxxxx; linux-
> > kernel@xxxxxxxxxxxxxxx; devel@xxxxxxxxxxxxxxxxxxxxxx; olaf@xxxxxxxxx;
> > apw@xxxxxxxxxxxxx; jasowang@xxxxxxxxxx; cavery@xxxxxxxxxx; KY
> > Srinivasan <kys@xxxxxxxxxxxxx>; Haiyang Zhang <haiyangz@xxxxxxxxxxxxx>;
> > joe@xxxxxxxxxxx; vkuznets@xxxxxxxxxx
> > Subject: Re: [PATCH v11 net-next 0/1] introduce Hyper-V VM Sockets(hv_sock)
> >
> > From: Dexuan Cui <decui@xxxxxxxxxxxxx>
> > Date: Sun, 15 May 2016 09:52:42 -0700
> >
> > > Changes since v10
> > >
> > > 1) add module params: send_ring_page, recv_ring_page. They can be used
> to
> > > enlarge the ringbuffer size to get better performance, e.g.,
> > > # modprobe hv_sock recv_ring_page=16 send_ring_page=16
> > > By default, recv_ring_page is 3 and send_ring_page is 2.
> > >
> > > 2) add module param max_socket_number (the default is 1024).
> > > A user can enlarge the number to create more than 1024 hv_sock sockets.
> > > By default, 1024 sockets take about 1024 * (3+2+1+1) * 4KB = 28M bytes.
> > > (Here 1+1 means 1 page for send/recv buffers per connection, respectively.)
> >
> > This is papering around my objections, and create module parameters which
> > I am fundamentally against.
> >
> > You're making the facility unusable by default, just to work around my
> > memory consumption concerns.
> >
> > What will end up happening is that everyone will simply increase the
> > values.
> >
> > You're not really addressing the core issue, and I will be ignoring you
> > future submissions of this change until you do.
>
> David,
> I am sorry I came across as ignoring your feedback; that was not my intention.
> The current host side design for this feature is such that each socket connection
> needs its own channel, which consists of
>
> 1. A ring buffer for host to guest communication
> 2. A ring buffer for guest to host communication
>
> The memory for the ring buffers has to be pinned down as this will be accessed
> both from interrupt level in Linux guest and from the host OS at any time.
>
> To address your concerns, I am planning to re-implement both the receive path
> and the send path so that no additional pinned memory will be needed.
>
> Receive Path:
> When the application does a read on the socket, we will dynamically allocate
> the buffer and perform the read operation on the incoming ring buffer. Since
> we will be in the process context, we can sleep here and will set the
> "GFP_KERNEL | __GFP_NOFAIL" flags. This buffer will be freed once the
> application consumes all the data.
>
> Send Path:
> On the send side, we will construct the payload to be sent directly on the
> outgoing ringbuffer.
>
> So, with these changes, the only memory that will be pinned down will be the
> memory for the ring buffers on a per-connection basis and this memory will be
> pinned down until the connection is torn down.
>
> Please let me know if this addresses your concerns.
>
> -- Dexuan

Hi David,
Ping. Really appreciate your comment.

Thanks,
-- Dexuan