PROBLEM: Old content of /proc/net after switching network namespace
From: Mateusz Stępień
Date: Fri Jan 04 2019 - 03:21:35 EST
Hello everyone,
After changing the network namespace with setns(), the content of /proc/net
still reflects the original namespace.
It looks like the procfs dentries are not properly invalidated in the dcache
after the namespace switch.
This happens only if the content of /proc/net is read before changing the
namespace.
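As a cross-check that the setns() call itself took effect, independently of what /proc/net shows, the namespace identity can be compared directly. The sketch below relies on the behaviour described in namespaces(7), where two /proc/PID/ns/net links refer to the same namespace when their device ID and inode number match. The helper name same_netns is hypothetical and not part of the original report.

```c
#include <sys/stat.h>

/*
 * Hypothetical helper (not from the original report): returns 1 if both
 * paths resolve to the same object, 0 if they differ, -1 on error.
 * For /proc/PID/ns/net paths, "same object" means "same network namespace".
 */
int same_netns(const char *path_a, const char *path_b)
{
    struct stat a, b;

    /* stat() follows the ns link, so st_dev/st_ino identify the namespace */
    if (stat(path_a, &a) == -1 || stat(path_b, &b) == -1)
        return -1;

    return a.st_dev == b.st_dev && a.st_ino == b.st_ino;
}
```

Called as same_netns("/proc/self/ns/net", argv[1]) after setns(), a result of 1 would suggest the process really is in the target namespace and only the /proc/net view is stale.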
The problem is reproducible in 4.19.13 but not in 4.14.X.
Bisecting the stable kernel tree shows that the commit
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=1da4d377f943fe4194ffb9fb9c26cc58fad4dd24
introduced the problem.
Reverting the mentioned commit resolves it.
MCVE (slightly modified example from [man 2 setns]):
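#define _GNU_SOURCE
#include <fcntl.h>
#include <sched.h>
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>

#define errExit(msg) do { perror(msg); exit(EXIT_FAILURE); \
                        } while (0)

/* Dump /proc/net/dev as seen by the calling process */
void print_dev(void)
{
    char buf[2048] = {0};
    ssize_t n;
    int fd2;

    fd2 = open("/proc/net/dev", O_RDONLY);
    if (fd2 == -1)
        errExit("open /proc/net/dev");
    n = read(fd2, buf, sizeof(buf) - 1);  /* keep the trailing NUL intact */
    if (n == -1)
        errExit("read");
    printf("%s", buf);
    close(fd2);
}

int
main(int argc, char *argv[])
{
    int fd;

    if (argc < 2) {
        fprintf(stderr, "usage: %s /proc/PID/ns/net\n", argv[0]);
        exit(EXIT_FAILURE);
    }

    /* Reading /proc/net before the switch is what triggers the problem */
    printf("before namespace switch =========\n");
    print_dev();

    fd = open(argv[1], O_RDONLY);  /* Get file descriptor for namespace */
    if (fd == -1)
        errExit("open");

    if (setns(fd, 0) == -1)        /* Join that namespace */
        errExit("setns");

    printf("after namespace switch ++++++++++\n");
    print_dev();
    return 0;
}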
Steps to reproduce (assuming we have an interface named enp0s9):
ip netns add test
ip link set dev enp0s9 netns test
ip netns exec test sleep 30 &
gcc -o mcve mcve.c # mcve.c contains above C code
./mcve /proc/$(pidof sleep)/ns/net
# before namespace switch =========
# Inter-| Receive | Transmit
# face |bytes packets errs drop fifo frame compressed multicast|bytes packets errs drop fifo colls carrier compressed
# lo: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
# br0: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
# enp0s3: 149625 1117 0 0 0 0 0 1 61664 485 0 0 0 0 0 0
# docker0: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
# enp0s8: 17086 60 0 0 0 0 0 29 1006 13 0 0 0 0 0 0
# after namespace switch ++++++++++
# Inter-| Receive | Transmit
# face |bytes packets errs drop fifo frame compressed multicast|bytes packets errs drop fifo colls carrier compressed
# lo: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
# br0: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
# enp0s3: 150813 1135 0 0 0 0 0 1 64348 503 0 0 0 0 0 0
# docker0: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
# enp0s8: 17086 60 0 0 0 0 0 29 1006 13 0 0 0 0 0 0
ip netns exec test cat /proc/net/dev
## Should display
# Inter-| Receive | Transmit
# face |bytes packets errs drop fifo frame compressed multicast|bytes packets errs drop fifo colls carrier compressed
# lo: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
# enp0s9: 10438 33 0 0 0 0 0 17 936 12 0 0 0 0 0 0
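To compare the "before" and "after" dumps programmatically rather than by eye, a small check for the presence of a given interface (for example enp0s9) may help. The following is only a sketch: has_interface is a hypothetical helper, and it assumes the raw /proc/net/dev layout shown above (two header lines, then one "name: counters" line per interface) without the "# " prefix used in this mail.

```c
#include <string.h>

/*
 * Hypothetical helper: returns 1 if `ifname` appears as an interface in
 * text formatted like /proc/net/dev, 0 otherwise.
 */
int has_interface(const char *text, const char *ifname)
{
    size_t want = strlen(ifname);
    int lineno = 0;

    while (*text) {
        const char *eol = strchr(text, '\n');
        size_t len = eol ? (size_t)(eol - text) : strlen(text);

        /* The first two lines are column headers, not interfaces */
        if (++lineno > 2) {
            const char *p = text;
            const char *end = text + len;

            while (p < end && *p == ' ')   /* names are right-aligned */
                p++;
            /* An interface line looks like "  name: counters..." */
            if ((size_t)(end - p) > want &&
                strncmp(p, ifname, want) == 0 && p[want] == ':')
                return 1;
        }
        text += len;
        if (*text == '\n')
            text++;
    }
    return 0;
}
```

With the buffer filled by print_dev() in the MCVE, has_interface(buf, "enp0s9") would be expected to return 1 after the switch on an unaffected kernel and 0 when the stale content is shown.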
Output from awk -f scripts/ver_linux:
Linux test-agent 4.19.13 #1 SMP PREEMPT Thu Jan 3 12:03:20 UTC 2019 x86_64 GNU/Linux
Util-linux 2.29.2
Mount 2.29.2
Module-init-tools 23
E2fsprogs 1.43.4
Linux C Library 2.24
Dynamic linker (ldd) 2.24
Linux C++ Library 6.0.22
Procps 3.3.12
Net-tools 2.10
Sh-utils 8.26
Udev 232
Modules Loaded ahci ata_generic ata_piix crc32c_intel e1000 ehci_hcd ehci_pci i2c_core i2c_piix4 libahci serio_raw usb_common usbcore