Re: [PATCH] futex: futex_find_get_task make credentials check conditional

From: Darren Hart
Date: Tue Jun 29 2010 - 10:56:43 EST

On 06/29/2010 01:42 AM, Michal Hocko wrote:
On Mon 28-06-10 18:49:08, Peter Zijlstra wrote:
On Mon, 2010-06-28 at 18:39 +0200, Michal Hocko wrote:
Would something like the following be acceptable (just compile-tested, without
comments)? It simply lets the caller of lookup_pi_state decide
whether credentials should be checked.

So it was Ingo, who in c87e2837be8 (pi-futex:
futex_lock_pi/futex_unlock_pi support) introduced the euid checks:

+ if ((current->euid != p->euid) && (current->euid != p->uid)) {
+ p = NULL;
+ goto out_unlock;
+ }

Ingo, do you remember the rationale behind that? It seems to be causing
grief when two different users contend on the same (shared) futex.

See the below proposed solution.

Here is the patch with comments and rationale:
(reference to the original discussion:

From f477a6d989dfde11c5bb5f28d5ce21d0682f4e25 Mon Sep 17 00:00:00 2001
From: Michal Hocko<mhocko@xxxxxxx>
Date: Tue, 29 Jun 2010 10:02:58 +0200
Subject: [PATCH] futex: futex_find_get_task make credentials check conditional

futex_find_get_task is currently used (through lookup_pi_state) from two
contexts, futex_requeue and futex_lock_pi_atomic. While the credentials check
makes sense in the first code path, the second one is more problematic,
because the check requires that the uid or euid of the PI lock holder (the
pid parameter) match the euid of the process that is trying to lock the
same futex (current).

This results in a glibc assertion failure, or a process hang (if glibc is
compiled without assert support), for a shared robust pthread mutex with
priority inheritance when a process tries to lock an already held lock
owned by a process with a different euid:

pthread_mutex_lock.c:312: __pthread_mutex_lock_full: Assertion `(-(e)) != 3 || !robust' failed.

The problem is that futex_lock_pi_atomic, which is called when we try to
lock an already held lock, checks the current holder (whose tid is stored in
the futex value) to get the PI state. It uses lookup_pi_state, which in turn
gets the task struct from futex_find_get_task. ESRCH is returned either when
the task is not found or when the credentials check fails.
futex_lock_pi_atomic simply returns if it gets ESRCH. glibc, however,
doesn't expect a robust lock operation to return ESRCH; it should see
either success or EOWNERDEAD (owner died).

Let's make the credentials check conditional (via a new parameter) in
futex_find_get_task. Then we can skip the check in the PI lock path
while still preserving it in the futex_requeue path.

Hi Michal,

All of the above is accurate; however, I think it emphasizes glibc's expectations, when the core of the issue is that shared PI futexes don't work across processes with different uids.

It seems like most users of shared futexes do so from the same uid; however, I can think of situations where it would be useful to use them from different uids. Since shared futexes key on their physical address, there shouldn't be any security issue with allowing different uids.

Signed-off-by: Michal Hocko<mhocko@xxxxxxx>
kernel/futex.c | 24 +++++++++++++++---------
1 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/kernel/futex.c b/kernel/futex.c
index e7a35f1..79b69e5 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -425,8 +425,9 @@ static void free_pi_state(struct futex_pi_state *pi_state)
* Look up the task based on what TID userspace gave us.
* We dont trust it.
+ * Check the credentials if required by check_cred

While we're changing comment blocks, please update it to a proper kerneldoc function descriptor:

* futex_find_get_task() - Lookup task by TID
* @pid: TID of the task_struct to find
* @check_cred: check credentials (1) or not (0)
* Look up the task based on the TID userspace gave us. We don't trust
* it. Optionally check the credentials.
* Returns a valid task_struct pointer or an error code embedded in the
* pointer value.

The above should probably also include whatever motivation Ingo comes back with for having done the uid check in the first place - which I confess I am not seeing.

-static struct task_struct * futex_find_get_task(pid_t pid)
+static struct task_struct * futex_find_get_task(pid_t pid, bool check_cred)

bool is nice; it isn't used elsewhere in this file, but it clearly defines the purpose. I may need to update some of the other flags throughout the file in a follow-on patch.

struct task_struct *p;
const struct cred *cred = current_cred(), *pcred;
@@ -436,10 +437,12 @@ static struct task_struct * futex_find_get_task(pid_t pid)
if (!p) {
} else {
- pcred = __task_cred(p);
- if (cred->euid != pcred->euid &&
- cred->euid != pcred->uid)
- p = ERR_PTR(-ESRCH);
+ if (check_cred) {
+ pcred = __task_cred(p);
+ if (cred->euid != pcred->euid &&
+ cred->euid != pcred->uid)
+ p = ERR_PTR(-ESRCH);
+ }
@@ -504,9 +507,10 @@ void exit_pi_state_list(struct task_struct *curr)

+/* check_cred is just passed through to futex_find_get_task */
static int
lookup_pi_state(u32 uval, struct futex_hash_bucket *hb,
- union futex_key *key, struct futex_pi_state **ps)
+ union futex_key *key, struct futex_pi_state **ps, bool check_cred)

Wrap at 80.

struct futex_pi_state *pi_state = NULL;
struct futex_q *this, *next;
@@ -563,7 +567,7 @@ lookup_pi_state(u32 uval, struct futex_hash_bucket *hb,
if (!pid)
return -ESRCH;
- p = futex_find_get_task(pid);
+ p = futex_find_get_task(pid, check_cred);
if (IS_ERR(p))
return PTR_ERR(p);

@@ -704,8 +708,10 @@ retry:
* We dont have the lock. Look up the PI state (or create it if
* we are the first waiter):
+ * Do not ask for credentials check because we want to share the
+ * lock between processes with different (e)uids

Please merge the new comments into the old. Keeping the original colon confuses the comment block. Try:

* We dont have the lock. Look up the PI state (or create it if
* we are the first waiter). Don't ask for a credentials check
* as we need to allow shared locks between processes with
* different (e)uids.


Darren Hart
IBM Linux Technology Center
Real-Time Linux Team