Re: [Security] [PATCH 2/2] taskstats: restrict access to user

From: Linus Torvalds
Date: Thu Jun 30 2011 - 12:50:06 EST


On Thu, Jun 30, 2011 at 3:59 AM, Balbir Singh <bsingharora@xxxxxxxxx> wrote:
> On Thu, Jun 30, 2011 at 1:27 PM, Vasiliy Kulikov <segoon@xxxxxxxxxxxx> wrote:
>> On Wed, Jun 29, 2011 at 13:09 -0700, Linus Torvalds wrote:
>>> Ok, having looked at this some more, I'm quite ready to just mark the
>>> whole TASKSTATS config option as BROKEN. It does seem to be a horrible
>>> security hazard, and very little seems to use it.
>>>
>
> I am not sure how you define BROKEN, BROKEN as per security rules?
> /proc/$pid/xxx also expose similar information.

No it doesn't.

/proc/$pid/xyz has always had proper security checks. Now, sometimes
they've been a bit too lax (iow, we've had cases where we just allowed
things we shouldn't - like this "io" file), but /proc/pid/ at least
*conceptually* has always had the "do I have permission to read this
or not". Do a "cat /proc/*/* > /dev/null" and you'll be getting a lot
of "permission denied" etc.

TASKSTATS? Nothing. Nada. Completely open.

That's just broken.

TASKSTATS also isn't apparently used by any normal program, and so
never got updated to namespaces. Again, /proc/xyz/ at least *tries*:
I'm not guaranteeing that it doesn't have some gaping holes, but I can
at least find the logic that saves and uses namespace information.

Again, TASKSTATS? Not so much.

> I must admit, I did not pay much attention to pid namespace changes as
> they were being made to taskstats. I'll look at the code later this
> week.

Some of the pid namespace issues could be fixed with the attached kind
of starting point.

But it's broken. See the comment I added. Why is it broken? Again -
taskstats is fundamentally broken, and doesn't seem to actually clean
up after itself. The "cleanup" depends on noticing at run-time that
some pid is stale and no longer listening. Again, that's just
fundamental brokenness in taskstats.

And it also only look sat pid namespaces. The network namespace issue
is separate.

So that's why I think it should be marked BROKEN. What applications
actually depend on this? iotop and what else? Because if it's just
iotop, I do suspect we might be better off telling people "ok,
disabling this will break iotop, but quite frankly, you're better off
without it".

Linus
kernel/taskstats.c | 16 ++++++++++++++--
1 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/kernel/taskstats.c b/kernel/taskstats.c
index fc0f22005417..620e89ae96cf 100644
--- a/kernel/taskstats.c
+++ b/kernel/taskstats.c
@@ -27,6 +27,7 @@
#include <linux/cgroup.h>
#include <linux/fs.h>
#include <linux/file.h>
+#include <linux/pid_namespace.h>
#include <net/genetlink.h>
#include <asm/atomic.h>

@@ -60,6 +61,7 @@ static const struct nla_policy cgroupstats_cmd_get_policy[CGROUPSTATS_CMD_ATTR_M
struct listener {
struct list_head list;
pid_t pid;
+ struct pid_namespace *pid_ns;
char valid;
};

@@ -127,6 +129,7 @@ static int send_reply(struct sk_buff *skb, struct genl_info *info)
static void send_cpu_listeners(struct sk_buff *skb,
struct listener_list *listeners)
{
+ struct pid_namespace *current_ns;
struct genlmsghdr *genlhdr = nlmsg_data(nlmsg_hdr(skb));
struct listener *s, *tmp;
struct sk_buff *skb_next, *skb_cur = skb;
@@ -140,8 +143,15 @@ static void send_cpu_listeners(struct sk_buff *skb,
}

rc = 0;
+ current_ns = task_active_pid_ns(current);
down_read(&listeners->sem);
list_for_each_entry(s, &listeners->list, list) {
+ /*
+ * BROKEN: what if "s" is stale? This way it will
+ * never be marked invalid!
+ */
+ if (s->pid_ns != current_ns)
+ continue;
skb_next = NULL;
if (!list_is_last(&s->list, &listeners->list)) {
skb_next = skb_clone(skb_cur, GFP_KERNEL);
@@ -287,6 +297,7 @@ static int add_del_listener(pid_t pid, const struct cpumask *mask, int isadd)
struct listener_list *listeners;
struct listener *s, *tmp, *s2;
unsigned int cpu;
+ struct pid_namespace *pid_ns = task_active_pid_ns(current);

if (!cpumask_subset(mask, cpu_possible_mask))
return -EINVAL;
@@ -300,13 +311,14 @@ static int add_del_listener(pid_t pid, const struct cpumask *mask, int isadd)
if (!s)
goto cleanup;
s->pid = pid;
+ s->pid_ns = pid_ns;
INIT_LIST_HEAD(&s->list);
s->valid = 1;

listeners = &per_cpu(listener_array, cpu);
down_write(&listeners->sem);
list_for_each_entry_safe(s2, tmp, &listeners->list, list) {
- if (s2->pid == pid)
+ if (s2->pid == pid && s2->pid_ns == pid_ns)
goto next_cpu;
}
list_add(&s->list, &listeners->list);
@@ -324,7 +336,7 @@ cleanup:
listeners = &per_cpu(listener_array, cpu);
down_write(&listeners->sem);
list_for_each_entry_safe(s, tmp, &listeners->list, list) {
- if (s->pid == pid) {
+ if (s->pid == pid && s->pid_ns == pid_ns) {
list_del(&s->list);
kfree(s);
break;