Re: RT scheduling and a way to make a process hang, unkillable

From: Dhaval Giani
Date: Mon Feb 16 2009 - 05:37:21 EST


On Sun, Feb 15, 2009 at 12:24:56PM +0100, Peter Zijlstra wrote:
> On Sat, 2009-02-14 at 16:51 -0800, Corey Hickey wrote:
> > Hello,
> >
> > I've encountered a bit of a problem in recent kernels that include
> > "Group scheduling for SCHED_RR/FIFO": it is possible for a process run
> > by root to hang itself and become unkillable--even by a 'kill -9'.
> >
> > The following kernel options must be set:
> > CONFIG_GROUP_SCHED=y
> > CONFIG_RT_GROUP_SCHED=y
> > CONFIG_USER_SCHED=y
> >
> > The procedure is for a program to:
> > 1. run as root
> > 2. set SCHED_FIFO
> > 3. change UID to a user with no realtime CPU share allocated
>
> Hmm, setuid() should fail in that situation.
>
> /me goes peek at code.
>
> Can't find any code to make that happen, Dhaval didn't we fix that at
> one point?

So after some searching around, I realized we did not. Does this help?
It fixes the problem on my system.
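
For reference, steps 2 and 3 of Corey's report can be sketched in
userspace roughly as below. This is only a minimal sketch and not
Corey's actual test program: the helper name fifo_then_setuid() and the
priority value 1 are assumptions made for illustration. With the patch
applied, the setuid() step would be expected to fail with EAGAIN when
the target user has no RT runtime. On an unpatched kernel with the
config options listed above, the caller may hang, so it should only be
tried as root on a machine that can be rebooted.

```c
#include <errno.h>
#include <sched.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: switch the calling process to SCHED_FIFO, then
 * change to target_uid. Returns 0 if both calls succeeded, or the
 * negated errno of whichever call failed first.
 */
int fifo_then_setuid(uid_t target_uid)
{
	/* Priority 1 is an arbitrary valid SCHED_FIFO priority on Linux. */
	struct sched_param sp = { .sched_priority = 1 };

	/* Step 2: become a realtime (SCHED_FIFO) task; needs privilege. */
	if (sched_setscheduler(0, SCHED_FIFO, &sp) == -1)
		return -errno;

	/* Step 3: change UID; this is the call the patch makes fail. */
	if (setuid(target_uid) == -1)
		return -errno;

	return 0;
}
```

A return value of 0 for a user with no realtime CPU share is the
problematic case; -EAGAIN is what the patch below aims for.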

--
sched: Don't allow setuid to succeed if the user does not have rt bandwidth

Corey Hickey reported that after using setuid() to change the UID of an
RT process, the process became unkillable and never ran again. This
happens because the target user's task group has no RT runtime. Add a
check that the user's task group can accept an RT task before setuid()
is allowed to succeed.

Disclaimer: I am not sure about the return values, or whether setuid()
is allowed to return anything other than EPERM and EAGAIN.

Not-Yet-Signed-off-by: Dhaval Giani <dhaval@xxxxxxxxxxxxxxxxxx>

Index: linux-2.6/include/linux/sched.h
===================================================================
--- linux-2.6.orig/include/linux/sched.h
+++ linux-2.6/include/linux/sched.h
@@ -2320,9 +2320,12 @@ extern long sched_group_rt_runtime(struc
 extern int sched_group_set_rt_period(struct task_group *tg,
 				      long rt_period_us);
 extern long sched_group_rt_period(struct task_group *tg);
+int sched_rt_can_attach(struct task_group *tg, struct task_struct *tsk);
 #endif
 #endif

+int rt_task_can_switch_user(uid_t uid, struct task_struct *tsk);
+
 #ifdef CONFIG_TASK_XACCT
 static inline void add_rchar(struct task_struct *tsk, ssize_t amt)
 {
Index: linux-2.6/kernel/sched.c
===================================================================
--- linux-2.6.orig/kernel/sched.c
+++ linux-2.6/kernel/sched.c
@@ -9466,6 +9466,16 @@ static int sched_rt_global_constraints(v

 	return ret;
 }
+
+int sched_rt_can_attach(struct task_group *tg, struct task_struct *tsk)
+{
+	/* Don't accept realtime tasks when there is no way for them to run */
+	if (rt_task(tsk) && tg->rt_bandwidth.rt_runtime == 0)
+		return -EINVAL;
+
+	return 0;
+}
+
 #else /* !CONFIG_RT_GROUP_SCHED */
 static int sched_rt_global_constraints(void)
 {
@@ -9559,8 +9569,7 @@ cpu_cgroup_can_attach(struct cgroup_subs
 		      struct task_struct *tsk)
 {
 #ifdef CONFIG_RT_GROUP_SCHED
-	/* Don't accept realtime tasks when there is no way for them to run */
-	if (rt_task(tsk) && cgroup_tg(cgrp)->rt_bandwidth.rt_runtime == 0)
+	if (sched_rt_can_attach(cgroup_tg(cgrp), tsk))
 		return -EINVAL;
 #else
 	/* We don't support RT-tasks being in separate groups */
Index: linux-2.6/kernel/user.c
===================================================================
--- linux-2.6.orig/kernel/user.c
+++ linux-2.6/kernel/user.c
@@ -216,8 +216,28 @@ static ssize_t cpu_rt_period_store(struc

 static struct kobj_attribute cpu_rt_period_attr =
 	__ATTR(cpu_rt_period, 0644, cpu_rt_period_show, cpu_rt_period_store);
+
 #endif

+#if defined(CONFIG_RT_GROUP_SCHED) && defined(CONFIG_USER_SCHED)
+/*
+ * We need to check if a setuid can take place. This function should be called
+ * before successfully completing the setuid.
+ */
+
+int rt_task_can_switch_user(uid_t uid, struct task_struct *tsk)
+{
+	struct user_struct *up = find_user(uid);
+
+	return sched_rt_can_attach(up->tg, tsk);
+}
+#else
+int rt_task_can_switch_user(uid_t uid, struct task_struct *tsk)
+{
+	return 0;
+}
+
+#endif
 /* default attributes per uid directory */
 static struct attribute *uids_attributes[] = {
 #ifdef CONFIG_FAIR_GROUP_SCHED
Index: linux-2.6/kernel/sys.c
===================================================================
--- linux-2.6.orig/kernel/sys.c
+++ linux-2.6/kernel/sys.c
@@ -579,6 +579,15 @@ static int set_user(struct cred *new)
 		return -EAGAIN;
 	}

+	/*
+	 * Though rt_task_can_switch_user() returns -EINVAL on failure, we
+	 * return -EAGAIN so as not to break setuid() semantics, and because
+	 * EAGAIN means a resource is unavailable, which is what has
+	 * happened here.
+	 */
+	if (rt_task_can_switch_user(new->uid, current))
+		return -EAGAIN;
+
 	free_uid(new->user);
 	new->user = new_user;
 	return 0;
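
The decision made by the hunks above can be modelled in plain userspace
C, which may help when reasoning about the return values. This is a
sketch under assumptions, not kernel code: struct model_group, struct
model_task and both model_* functions are hypothetical stand-ins for
task_group, task_struct, sched_rt_can_attach() and the new check in
set_user().

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the rt_bandwidth part of struct task_group. */
struct model_group {
	long rt_runtime_us;	/* 0 means no realtime share allocated */
};

/* Hypothetical stand-in for what rt_task(tsk) inspects. */
struct model_task {
	bool is_rt;		/* true for SCHED_FIFO/SCHED_RR tasks */
};

/* Models sched_rt_can_attach(): refuse RT tasks that could never run. */
int model_rt_can_attach(const struct model_group *tg,
			const struct model_task *tsk)
{
	if (tsk->is_rt && tg->rt_runtime_us == 0)
		return -EINVAL;

	return 0;
}

/* Models the set_user() hunk: any refusal is reported as -EAGAIN. */
int model_set_user(const struct model_group *new_user_tg,
		   const struct model_task *tsk)
{
	if (model_rt_can_attach(new_user_tg, tsk))
		return -EAGAIN;

	return 0;
}
```

In this model, an RT task moving to a group with zero runtime gets
-EAGAIN from model_set_user(), while a non-RT task, or any task moving
to a group with nonzero runtime, gets 0.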

--
regards,
Dhaval