[PATCH] oom: use euid instead of CAP_SYS_ADMIN for protection root process

From: KOSAKI Motohiro
Date: Tue May 31 2011 - 05:28:30 EST


Recently, many userland daemons have started to use libcap-ng and
drop all privileges just after startup, because (1) most privileges
are needed only when opening special files, not when reading and
writing them, and (2) in general, dropping privileges gives better
protection from exploits when bugs are found in the daemon.

But this makes oom-killer behavior suboptimal. CAI Qian reported
that the oom killer killed some important daemons first on his
Fedora-like distro, because they had dropped CAP_SYS_ADMIN.

Of course, we recommend dropping privileges as far as possible
instead of keeping them. Thus, the oom killer should not check
any capability; doing so implicitly encourages a wrong programming
style.

This patch changes the root process check from CAP_SYS_ADMIN to
just euid==0.

Reported-by: CAI Qian <caiqian@xxxxxxxxxx>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@xxxxxxxxxxxxxx>
---
mm/oom_kill.c | 8 ++++----
1 files changed, 4 insertions(+), 4 deletions(-)
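
For illustration only (not part of the patch): a minimal userspace sketch
of the root bonus after this change. The helper name apply_root_bonus()
and its flat argument list are hypothetical; the kernel code works on a
struct task_struct inside oom_badness(). The clamp to zero in the else
branch is an assumption, since that line is outside the hunk context
below. Note that totalpages / 32 is about 3.1%, which is the "3% bonus"
mentioned in the comment.

```c
#include <stdbool.h>

/*
 * Hypothetical userspace model of the root bonus in oom_badness().
 * A task with euid == 0 gets totalpages / 32 (roughly 3%) subtracted
 * from its badness points, whatever capabilities it has kept or dropped.
 */
static unsigned long apply_root_bonus(unsigned long points,
				      unsigned long totalpages,
				      bool protect_root,
				      unsigned int euid)
{
	if (protect_root && euid == 0) {
		if (points >= totalpages / 32)
			points -= totalpages / 32;
		else
			points = 0;	/* assumed: clamp instead of underflow */
	}
	return points;
}
```

With totalpages = 3200 the bonus is 100 points, so a root-owned daemon
scoring 1000 drops to 900 even if it has dropped CAP_SYS_ADMIN, while a
task with a non-zero euid keeps its score unchanged.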

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 59eda6e..4e1e8a5 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -203,7 +203,7 @@ unsigned long oom_badness(struct task_struct *p, struct mem_cgroup *mem,
* Root processes get 3% bonus, just like the __vm_enough_memory()
* implementation used by LSMs.
*/
- if (protect_root && has_capability_noaudit(p, CAP_SYS_ADMIN)) {
+ if (protect_root && (task_euid(p) == 0)) {
if (points >= totalpages / 32)
points -= totalpages / 32;
else
@@ -429,7 +429,7 @@ static void dump_tasks(const struct mem_cgroup *mem, const nodemask_t *nodemask)
struct task_struct *p;
struct task_struct *task;

- pr_info("[ pid] ppid uid cap total_vm rss swap score_adj name\n");
+ pr_info("[ pid] ppid uid euid total_vm rss swap score_adj name\n");
for_each_process(p) {
if (oom_unkillable_task(p, mem, nodemask))
continue;
@@ -444,9 +444,9 @@ static void dump_tasks(const struct mem_cgroup *mem, const nodemask_t *nodemask)
continue;
}

- pr_info("[%6d] %6d %5d %3d %8lu %8lu %8lu %9d %s\n",
+ pr_info("[%6d] %6d %5d %5d %8lu %8lu %8lu %9d %s\n",
task_tgid_nr(task), task_tgid_nr(task->real_parent),
- task_uid(task), has_capability_noaudit(task, CAP_SYS_ADMIN),
+ task_uid(task), task_euid(task),
task->mm->total_vm,
get_mm_rss(task->mm) + task->mm->nr_ptes,
get_mm_counter(task->mm, MM_SWAPENTS),
--
1.7.3.1


