Re: [PATCH v3] net: fully namespace net.core.{r,w}mem_{default,max} sysctls

From: Danny Lin
Date: Tue Mar 25 2025 - 08:40:12 EST


On Tue, Mar 25, 2025 at 4:39 AM Eric Dumazet <edumazet@xxxxxxxxxx> wrote:
>
> On Fri, Mar 21, 2025 at 5:35 AM Danny Lin <danny@xxxxxxxxxxxx> wrote:
> >
> > This builds on commit 19249c0724f2 ("net: make net.core.{r,w}mem_{default,max} namespaced")
> > by adding support for writing the sysctls from within net namespaces,
> > rather than only reading the values that were set in init_net. These are
> > relatively commonly-used sysctls, so programs may try to set them without
> > knowing that they're in a container. It can be surprising for such attempts
> > to fail with EACCES.
> >
> > Unlike other net sysctls that were converted to namespaced ones, many
> > systems have a sysctl.conf (or other configs) that globally write to
> > net.core.rmem_default on boot and expect the value to propagate to
> > containers, and programs running in containers may depend on the increased
> > buffer sizes in order to work properly. This means that namespacing the
> > sysctls and using the kernel default values in each new netns would break
> > existing workloads.
> >
> > As a compromise, inherit the initial net.core.*mem_* values from the
> > current process' netns when creating a new netns. This is not standard
> > behavior for most netns sysctls, but it avoids breaking existing workloads.
> >
> > Signed-off-by: Danny Lin <danny@xxxxxxxxxxxx>
>
> Patch looks good, but see below:
>
> > diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
> > index c7769ee0d9c5..aedc249bf0e2 100644
> > --- a/net/core/sysctl_net_core.c
> > +++ b/net/core/sysctl_net_core.c
> > @@ -676,21 +676,9 @@ static struct ctl_table netns_core_table[] = {
> >  		.extra2 = SYSCTL_ONE,
> >  		.proc_handler = proc_dou8vec_minmax,
> >  	},
> > -	{
> > -		.procname = "tstamp_allow_data",
> > -		.data = &init_net.core.sysctl_tstamp_allow_data,
> > -		.maxlen = sizeof(u8),
> > -		.mode = 0644,
> > -		.proc_handler = proc_dou8vec_minmax,
> > -		.extra1 = SYSCTL_ZERO,
> > -		.extra2 = SYSCTL_ONE
> > -	},
> > -	/* sysctl_core_net_init() will set the values after this
> > -	 * to readonly in network namespaces
> > -	 */
>
> I think you have removed this sysctl :/

Fixed, sorry about that!

Best,
Danny
Founder @ OrbStack