[PATCH net v2] net: Clarify the difference between hard_header_len and needed_headroom

From: Xie He
Date: Thu Sep 10 2020 - 01:43:48 EST


The difference between hard_header_len and needed_headroom has long been
confusing to driver developers. Let's clarify it.

The understanding on this issue in this patch is based on the following
reasons:

1.

In af_packet.c, the function packet_snd first reserves a headroom of
length (dev->hard_header_len + dev->needed_headroom).
Then if the socket is a SOCK_DGRAM socket, it calls dev_hard_header,
which calls dev->header_ops->create, to create the link layer header.
If the socket is a SOCK_RAW socket, it "un-reserves" a headroom of
length (dev->hard_header_len), and checks if the user has provided a
header of length (dev->hard_header_len) (in dev_validate_header).
This shows the developers of af_packet.c expect hard_header_len to
be consistent with header_ops.

2.

In af_packet.c, the function packet_sendmsg_spkt has a FIXME comment.
That comment states that prepending an LL header internally in a driver
is considered a bug. I believe this bug can be fixed by setting
hard_header_len to 0, making the internal header completely invisible
to af_packet.c (and requesting the headroom in needed_headroom instead).

3.

There is a commit for a WiFi driver:
commit 9454f7a895b8 ("mwifiex: set needed_headroom, not hard_header_len")
According to the discussion about it at:
https://patchwork.kernel.org/patch/11407493/
The author tried to set the WiFi driver's hard_header_len to the Ethernet
header length, and request additional header space internally needed by
setting needed_headroom. This means this usage is already adopted by
driver developers.

Cc: Willem de Bruijn <willemdebruijn.kernel@xxxxxxxxx>
Cc: Eric Dumazet <eric.dumazet@xxxxxxxxx>
Cc: Brian Norris <briannorris@xxxxxxxxxxxx>
Cc: Cong Wang <xiyou.wangcong@xxxxxxxxx>
Signed-off-by: Xie He <xie.he.0141@xxxxxxxxx>
---

Change from v1:
Small change to the commit message.

---
include/linux/netdevice.h | 4 ++--
net/packet/af_packet.c | 19 +++++++++++++------
2 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 7bd4fcdd0738..3999b04e435d 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1691,8 +1691,8 @@ enum netdev_priv_flags {
* @min_mtu: Interface Minimum MTU value
* @max_mtu: Interface Maximum MTU value
* @type: Interface hardware type
- * @hard_header_len: Maximum hardware header length.
- * @min_header_len: Minimum hardware header length
+ * @hard_header_len: Maximum length of the headers created by header_ops
+ * @min_header_len: Minimum length of the headers created by header_ops
*
* @needed_headroom: Extra headroom the hardware may need, but not in all
* cases can this be guaranteed
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 2b33e977a905..0e324b08cb2e 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -93,12 +93,15 @@

/*
Assumptions:
- - if device has no dev->hard_header routine, it adds and removes ll header
- inside itself. In this case ll header is invisible outside of device,
- but higher levels still should reserve dev->hard_header_len.
- Some devices are enough clever to reallocate skb, when header
- will not fit to reserved space (tunnel), another ones are silly
- (PPP).
+ - If the device has no dev->header_ops, there is no LL header visible
+ above the device. In this case, its hard_header_len should be 0.
+ The device may prepend its own header internally. In this case, its
+ needed_headroom should be set to the space needed for it to add its
+ internal header.
+ For example, a WiFi driver pretending to be an Ethernet driver should
+ set its hard_header_len to be the Ethernet header length, and set its
+ needed_headroom to be (the real WiFi header length - the fake Ethernet
+ header length).
- packet socket receives packets with pulled ll header,
so that SOCK_RAW should push it back.

@@ -2937,10 +2940,14 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
skb_reset_network_header(skb);

err = -EINVAL;
+ if (!dev->header_ops)
+ WARN_ON_ONCE(dev->hard_header_len != 0);
if (sock->type == SOCK_DGRAM) {
offset = dev_hard_header(skb, dev, ntohs(proto), addr, NULL, len);
if (unlikely(offset < 0))
goto out_free;
+ WARN_ON_ONCE(offset > dev->hard_header_len);
+ WARN_ON_ONCE(offset < dev->min_header_len);
} else if (reserve) {
skb_reserve(skb, -reserve);
if (len < reserve + sizeof(struct ipv6hdr) &&
--
2.25.1