[PATCH 4/9] User namespace: don't allow sysctl in non-init user ns (v2)

From: Serge Hallyn
Date: Tue Oct 18 2011 - 17:55:31 EST


From: Serge Hallyn <serge.hallyn@xxxxxxxxxxxxx>

kernel/sysctl.c has its own custom uid check in test_perm(), which is
not user namespace aware. As Richard discovered, this gives root in a
container privileged access to set all sysctls.

To fix that, skip the uid and group comparison if current is not in the
initial user namespace, so that only the world ("other") permission bits
of the sysctl entry apply. We may at some point want to relax that check
so that some sysctls are allowed - for instance dmesg_restrict when
syslog is containerized.

Changelog:
Sep 22: As Miquel van Smoorenburg pointed out, rather than always
refusing access if not in initial user_ns, we should allow
world access rights to sysctl files. We just want to prevent
a task in a non-init user namespace from getting the root user
or group access rights.

Signed-off-by: Serge Hallyn <serge.hallyn@xxxxxxxxxxxxx>
Cc: "Eric W. Biederman" <ebiederm@xxxxxxxxxxxx>
Cc: Vasiliy Kulikov <segoon@xxxxxxxxxxxx>
Cc: richard@xxxxxx
Cc: Miquel van Smoorenburg <mikevs@xxxxxxxxxx>
---
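Illustration only, not part of the patch: a minimal userspace sketch of
the permission selection described in the commit message. struct caller
and test_perm_model() are hypothetical stand-ins for current's
credentials and for test_perm(). The MAY_* values are assumed to match
include/linux/fs.h.

```c
#include <errno.h>
#include <stdbool.h>

/* Assumed to mirror the kernel's MAY_* bits in include/linux/fs.h. */
#define MAY_EXEC  1
#define MAY_WRITE 2
#define MAY_READ  4

/* Hypothetical stand-in for the credentials of "current". */
struct caller {
	bool in_init_user_ns;	/* models current_user_ns() == &init_user_ns */
	unsigned int euid;	/* models current_euid() */
	bool in_group0;		/* models in_egroup_p(0) */
};

/*
 * Userspace model of the patched test_perm(): the owner and group bits
 * of the sysctl mode are considered only for callers in the initial
 * user namespace. Everyone else is checked against the "other" bits.
 */
static int test_perm_model(const struct caller *c, int mode, int op)
{
	if (c->in_init_user_ns) {
		if (c->euid == 0)
			mode >>= 6;	/* use the owner bits */
		else if (c->in_group0)
			mode >>= 3;	/* use the group bits */
	}
	if ((op & ~mode & (MAY_READ | MAY_WRITE | MAY_EXEC)) == 0)
		return 0;
	return -EACCES;
}
```

With a typical 0644 entry, a caller modelled as euid 0 in the initial
namespace passes a MAY_WRITE check. A caller modelled as euid 0 in a
non-init namespace gets -EACCES for MAY_WRITE but still passes MAY_READ
through the world bits.
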
 kernel/sysctl.c |   10 ++++++----
 1 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 11d65b5..95988dc 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -1697,10 +1697,12 @@ void register_sysctl_root(struct ctl_table_root *root)

 static int test_perm(int mode, int op)
 {
-	if (!current_euid())
-		mode >>= 6;
-	else if (in_egroup_p(0))
-		mode >>= 3;
+	if (current_user_ns() == &init_user_ns) {
+		if (!current_euid())
+			mode >>= 6;
+		else if (in_egroup_p(0))
+			mode >>= 3;
+	}
 	if ((op & ~mode & (MAY_READ|MAY_WRITE|MAY_EXEC)) == 0)
 		return 0;
 	return -EACCES;
--
1.7.5.4
