Hi,
On Tue, 2019-06-25 at 14:42 +0800, Zhiyuan Hou wrote:
In ipvlan l3s mode, ingress packet is switched to slave interface andThis looks weird to me: if I read the code correctly, the skb will
delivers to l4 stack. This may cause two problems:
1. When slave is in an ns different from master, the behavior of stack
in slave ns may cause confusion for users. For example, iptables, tc,
and other l2/l3 functions are not available for ingress packet.
2. l3s mode is not used for tap device, and cannot support ipvtap. But
in VM or container based VM cases, tap device is a very common device.
In l3s mode's input nf_hook, this patch calles the skb_forward_dev() to
forward ingress packet to slave and uses nf_conntrack_confirm() to make
conntrack work with new mode.
Signed-off-by: Zha Bin <zhabin@xxxxxxxxxxxxxxxxx>
Signed-off-by: Zhiyuan Hou <zhiyuan2048@xxxxxxxxxxxxxxxxx>
---
drivers/net/ipvlan/ipvlan.h | 9 ++++++++-
drivers/net/ipvlan/ipvlan_l3s.c | 16 ++++++++++++++--
2 files changed, 22 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ipvlan/ipvlan.h b/drivers/net/ipvlan/ipvlan.h
index 3837c897832e..48c814e24c3f 100644
--- a/drivers/net/ipvlan/ipvlan.h
+++ b/drivers/net/ipvlan/ipvlan.h
@@ -172,6 +172,14 @@ void ipvlan_link_delete(struct net_device *dev, struct list_head *head);
void ipvlan_link_setup(struct net_device *dev);
int ipvlan_link_register(struct rtnl_link_ops *ops);
#ifdef CONFIG_IPVLAN_L3S
+
+#include <net/netfilter/nf_conntrack_core.h>
+
+static inline int ipvlan_confirm_conntrack(struct sk_buff *skb)
+{
+ return nf_conntrack_confirm(skb);
+}
+
int ipvlan_l3s_register(struct ipvl_port *port);
void ipvlan_l3s_unregister(struct ipvl_port *port);
void ipvlan_migrate_l3s_hook(struct net *oldnet, struct net *newnet);
@@ -206,5 +214,4 @@ static inline bool netif_is_ipvlan_port(const struct net_device *dev)
{
return rcu_access_pointer(dev->rx_handler) == ipvlan_handle_frame;
}
-
#endif /* __IPVLAN_H */
diff --git a/drivers/net/ipvlan/ipvlan_l3s.c b/drivers/net/ipvlan/ipvlan_l3s.c
index 943d26cbf39f..ed210002f593 100644
--- a/drivers/net/ipvlan/ipvlan_l3s.c
+++ b/drivers/net/ipvlan/ipvlan_l3s.c
@@ -95,14 +95,26 @@ static unsigned int ipvlan_nf_input(void *priv, struct sk_buff *skb,
{
struct ipvl_addr *addr;
unsigned int len;
+ int ret = NF_ACCEPT;
+ bool success;
addr = ipvlan_skb_to_addr(skb, skb->dev);
if (!addr)
goto out;
- skb->dev = addr->master->dev;
len = skb->len + ETH_HLEN;
- ipvlan_count_rx(addr->master, len, true, false);
+
+ ret = ipvlan_confirm_conntrack(skb);
+ if (ret != NF_ACCEPT) {
+ ipvlan_count_rx(addr->master, len, false, false);
+ goto out;
+ }
+
+ skb_push_rcsum(skb, ETH_HLEN);
+ success = dev_forward_skb(addr->master->dev, skb) == NET_RX_SUCCESS;
traverse twice NF_INET_LOCAL_IN, once due to the l3s hooking and
another one due to dev_forward_skb().
Also, tc ingreess, etc will run after the first traversing of
NF_INET_LOCAL_IN.
All in all I think that if full l2 processing is required, a different
mode or a different virtual device should be used.
Cheers,
Paolo