-----Original Message-----You are right; the rndis header can be built as a separate fragment and sent.
From: Alexander Duyck [mailto:alexander.duyck@xxxxxxxxx]
Sent: Wednesday, September 16, 2015 2:39 PM
To: KY Srinivasan <kys@xxxxxxxxxxxxx>; David Laight
<David.Laight@xxxxxxxxxx>; Haiyang Zhang <haiyangz@xxxxxxxxxxxxx>;
Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>; netdev@xxxxxxxxxxxxxxx
Cc: David S. Miller <davem@xxxxxxxxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx;
Jason Wang <jasowang@xxxxxxxxxx>
Subject: Re: [PATCH net-next RFC] net: increase LL_MAX_HEADER for Hyper-V
On 09/16/2015 10:55 AM, KY Srinivasan wrote:
Every packet-----Original Message-----Remote NDIS is the protocol used to send packets from the guest to the host.
From: David Laight [mailto:David.Laight@xxxxxxxxxx]
Sent: Wednesday, September 16, 2015 9:25 AM
To: Haiyang Zhang <haiyangz@xxxxxxxxxxxxx>; Vitaly Kuznetsov
<vkuznets@xxxxxxxxxx>; netdev@xxxxxxxxxxxxxxx
Cc: David S. Miller <davem@xxxxxxxxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx;
KY Srinivasan <kys@xxxxxxxxxxxxx>; Jason Wang <jasowang@xxxxxxxxxx>
Subject: RE: [PATCH net-next RFC] net: increase LL_MAX_HEADER for Hyper-
V
From: Haiyang Zhang
Sent: 16 September 2015 17:09kernel@xxxxxxxxxxxxxxx;
-----Original Message-----
From: Vitaly Kuznetsov [mailto:vkuznets@xxxxxxxxxx]
Sent: Wednesday, September 16, 2015 11:50 AM
To: netdev@xxxxxxxxxxxxxxx
Cc: David S. Miller <davem@xxxxxxxxxxxxx>; linux-
VKY Srinivasan <kys@xxxxxxxxxxxxx>; Haiyang Zhang
<haiyangz@xxxxxxxxxxxxx>; Jason Wang <jasowang@xxxxxxxxxx>
Subject: [PATCH net-next RFC] net: increase LL_MAX_HEADER for Hyper-
EliminateCommit b08cc79155fc26d0d112b1470d1ece5034651a4b ("hv_netvsc:
path,memory allocation in the packet send path") introduced skb headroom
request for Hyper-V netvsc driver:
max_needed_headroom = sizeof(struct hv_netvsc_packet) +
sizeof(struct rndis_message) +
NDIS_VLAN_PPI_SIZE + NDIS_CSUM_PPI_SIZE +
NDIS_LSO_PPI_SIZE + NDIS_HASH_PPI_SIZE;
...
net->needed_headroom = max_needed_headroom;
max_needed_headroom is 220 bytes, it significantly exceeds the
LL_MAX_HEADER setting. This causes each skb to be cloned on send
LL_MAX_HEADERe.g. for IPv4 case we fall into the following clause
(ip_finish_output2()):
if (unlikely(skb_headroom(skb) < hh_len && dev->header_ops)) {
...
skb2 = skb_realloc_headroom(skb, LL_RESERVED_SPACE(dev));
...
}
leading to a significant performance regression. Increase
HVNETVSC_MAX_HEADER.to make it suitable for netvsc, make it 224 to be 16-aligned.Thanks for the patch.
Alternatively we could (partially) revert the commit which introduced
skb
headroom request restoring manual memory allocation on transmit path.
Signed-off-by: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>
---
include/linux/netdevice.h | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 88a0069..7233790 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -132,7 +132,9 @@ static inline bool dev_xmit_complete(int rc)
* used.
*/
-#if defined(CONFIG_WLAN) || IS_ENABLED(CONFIG_AX25)
+#if IS_ENABLED(CONFIG_HYPERV_NET)
+# define LL_MAX_HEADER 224
+#elif defined(CONFIG_WLAN) || IS_ENABLED(CONFIG_AX25)
# if defined(CONFIG_MAC80211_MESH)
# define LL_MAX_HEADER 128
# else
To avoid we forget to update that 224 number when we add more things
into netvsc header, I suggest that we define a macro in netdevice.h such
as:
#define HVNETVSC_MAX_HEADER 224
#define LL_MAX_HEADER HVNETVSC_MAX_HEADER
And, put a note in netvsc code saying the header reservation shouldn't
exceed HVNETVSC_MAX_HEADER, or you need to update
Am I right in thinking this is adding an extra 96 unused bytes to the front
of almost all skb just so that hyper-v can make its link level header
contiguous with whatever follows (IP header ?).
Doesn't sound ideal.
needs to be decorated with the RNDIS header and the maximum room neededfor the RNDIS
header is the hreadroom we want.I think we get that. The question is does the Remote NDIS header and
packet info actually need to be a part of the header data? I would
argue that it probably doesn't.
So for example in netvsc_start_xmit it looks like you are calling
init_page_array in order to populate a set of page buffers, but the
first buffer for the Remote NDIS protocol is populated as a separate
page and offset. As such it doesn't seem like it necessarily needs to
be a part of the header data but could be maintained perhaps in a
separate ring buffer, or perhaps just be a separate page that you break
up to use for each header.
Indeed this is what we were doing earlier - on the outgoing path we would allocate
memory for the rndis header. My goal was to avoid this allocation on every packet being
sent and I decided to use the headroom instead. If we can completely avoid all memory
allocation for rndis header, it makes a significant perf difference:
Throughput as measured by iperf on a 40G interface (VM to VM on two nodes) in Gbps.
Scenario #A: LL_MAX_HEADER =128 [no change], needed_headroom = 220 [no change]
Scenario #B: LL_MAX_HEADER =224, needed_headroom = 220 [no change]
Conn# #A #B
1 6.9 8.2
2 13.2 14.9
4 17.6 16.6
8 24.1 26.9
16 24.0 31.5
32 24.5 33.6
64 31.6 31.5
128 29.6 30.3
Column A is the existing code where we end up having to allocate more headroom and column B is with
Vitaly's patch. I will experiment with a light-weight allocator for the rndis header.
Regards,
K. Y