[PATCH net-next v5 0/4] netns: allow to identify peer netns
From: Nicolas Dichtel
Date: Thu Jan 15 2015 - 09:18:35 EST
The goal of this serie is to be able to multicast netlink messages with an
attribute that identify a peer netns.
This is needed by the userland to interpret some information contained in
netlink messages (like IFLA_LINK value, but also some other attributes in case
of x-netns netdevice (see also
http://thread.gmane.org/gmane.linux.network/315933/focus=316064 and
http://thread.gmane.org/gmane.linux.kernel.containers/28301/focus=4239)).
Ids of peer netns can be set by userland via a new rtnl cmd RTM_NEWNSID. When
the kernel needs an id for a peer (for example when advertising a new x-netns
interface via netlink), if the user didn't allocate an id, one will be
automatically allocated.
These ids are stored per netns and are local (ie only valid in the netns where
they are set). To avoid allocating an int for each peer netns, I use
idr_for_each() to retrieve the id of a peer netns. Note that it will be possible
to add a table (struct net -> id) later to optimize this lookup if needed.
Patch 1/4 introduces the rtnetlink API mechanism to set and get these ids.
Patch 2/4 and 3/4 implements an example of how to use these ids when advertising
information about a x-netns interface.
And patch 4/4 shows that the netlink messages can be symetric between a GET and
a SET.
iproute2 patches are available, I can send them on demand.
Here is a small screenshot to show how it can be used by userland.
# Initialization:
$ ip netns add foo
$ ip netns del foo
$ ip netns
$ touch /var/run/netns/init_net
$ mount --bind /proc/1/ns/net /var/run/netns/init_net
$ ip netns add foo
$ ip -n foo netns
foo
init_net
$ ip -n foo netns set init_net 0
$ ip -n foo netns set foo 1
# Only netns seen from foo have an id:
$ ip netns
foo
init_net
$ ip -n foo netns
foo (id: 1)
init_net (id: 0)
# Add a 4in4 x-netns interface with a link-netnsid option and check the dump:
$ ip -n foo link add ipip1 link-netnsid 0 type ipip remote 10.16.0.121 local 10.16.0.249
$ ip -n foo link ls ipip1
6: ipip1@NONE: <POINTOPOINT,NOARP> mtu 1480 qdisc noop state DOWN mode DEFAULT group default
link/ipip 10.16.0.249 peer 10.16.0.121 link-netnsid 0
# The parameter link-netnsid shows us where the interface sends and receives
# packets (and thus we know where encapsulated addresses are set).
# Add a 4in4 x-netns interface without a link-netnsid option and check that an
# id is allocated in init_net for foo
$ ip netns
foo
init_net
$ ip -n foo link add ipip2 type ipip remote 10.16.0.121 local 10.16.0.249
$ ip -n foo link set ipip2 netns init_net
$ ip link ls ipip2
7: ipip2@NONE: <POINTOPOINT,NOARP> mtu 1480 qdisc noop state DOWN mode DEFAULT group default
link/ipip 10.16.0.249 peer 10.16.0.121 link-netnsid 0
$ ip netns
foo (id: 0)
init_net
v4 -> v5:
use rtnetlink instead of genetlink
allocate automatically an id if user didn't assign one
rename include/uapi/linux/netns.h to include/uapi/linux/net_namespace.h
add vxlan in patch #3
RFCv3 -> v4:
rebase on net-next
add copyright text in the new netns.h file
RFCv2 -> RFCv3:
ids are now defined by userland (via netlink). Ids are stored in each netns
(and they are local to this netns).
add get_link_net support for ip6 tunnels
netnsid is now a s32 instead of a u32
RFCv1 -> RFCv2:
remove useless ()
ids are now stored in the user ns. It's possible to get an id for a peer netns
only if the current netns and the peer netns have the same user ns parent.
MAINTAINERS | 1 +
drivers/net/vxlan.c | 8 ++
include/net/ip6_tunnel.h | 1 +
include/net/ip_tunnels.h | 1 +
include/net/net_namespace.h | 4 +
include/net/rtnetlink.h | 2 +
include/uapi/linux/Kbuild | 1 +
include/uapi/linux/if_link.h | 1 +
include/uapi/linux/net_namespace.h | 23 ++++
include/uapi/linux/rtnetlink.h | 5 +
net/core/net_namespace.c | 210 +++++++++++++++++++++++++++++++++++++
net/core/rtnetlink.c | 38 ++++++-
net/ipv4/ip_gre.c | 2 +
net/ipv4/ip_tunnel.c | 8 ++
net/ipv4/ip_vti.c | 1 +
net/ipv4/ipip.c | 1 +
net/ipv6/ip6_gre.c | 1 +
net/ipv6/ip6_tunnel.c | 9 ++
net/ipv6/ip6_vti.c | 1 +
net/ipv6/sit.c | 1 +
20 files changed, 316 insertions(+), 3 deletions(-)
Comments are welcome.
Regards,
Nicolas
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/