Re: [PATCH] Selinux/hooks.c: Fix a NULL pointer dereference caused by semop()

From: ethan zhao
Date: Thu Jan 22 2015 - 21:38:55 EST


Davidlohr,

On 2015/1/23 4:48, Davidlohr Bueso wrote:
On Thu, 2015-01-22 at 14:05 -0500, Stephen Smalley wrote:
On Wed, Jan 21, 2015 at 9:44 PM, Ethan Zhao <ethan.kernel@xxxxxxxxx> wrote:
On Wed, Jan 21, 2015 at 1:30 PM, Manfred Spraul
<manfred@xxxxxxxxxxxxxxxx> wrote:
On 01/21/2015 04:53 AM, Ethan Zhao wrote:
On Tue, Jan 20, 2015 at 10:10 PM, Stephen Smalley <sds@xxxxxxxxxxxxx>
wrote:
On 01/20/2015 04:18 AM, Ethan Zhao wrote:
sys_semget()
->newary()
->security_sem_alloc()
->sem_alloc_security()
selinux_sem_alloc_security()
->ipc_alloc_security() {
->rc = avc_has_perm()
if (rc) {

ipc_free_security(&sma->sem_perm);
return rc;
We free the security structure here to avoid a memory leak on a
failed/denied semaphore set creation. In this situation, we return an
error to the caller (ultimately to newary), it does an
ipc_rcu_putref(sma, ipc_rcu_free), and it returns an error to the
caller. Thus, it never calls ipc_addid() and the semaphore set is not
created. So how then can you call semtimedop() on it?
Seems it wouldn't happen after commit
e8577d1f0329d4842e8302e289fb2c22156abef4 ?
That was my first guess when I read the bug report - but it can't be the
fix, because security_sem_alloc() is before the ipc_addid(), with or without
the patch.

thread A:
thread B:

semtimedop()
-> sem_obtain_object_check()
semctl(IPC_RMID)
-> freeary()
-> ipc_rcu_putref()
-> call_rcu()
-> somehow a grace period
-> sem_rcu_free()
-> security_sem_free()

Perhaps: modify ipc_free_security() to hexdump perm and a few more bytes if
the pointer is NULL?
I tried to ask for vmcore and do more analysis, basically, the race condition
still exists and open a hole to be DoS.
You said the patch was tested with 3.19-rc5. But did you reproduce
the bug on that kernel version before the patch? If not, what kernel
version were you running when you triggered the bug?
Also, is this a vanilla kernel? Or from a distro?
The hard thing, it is hit on customer's environment, the issue kernel doesn't
have many commits far from the last about IPC/SElinux.

Essentially, did the kernel with the reproducible bug have:
Not easy to do reproduce it is triggered by a process "opcmon" not public to everyone.
What I have is the panic log. the vmware not get yet.

Thanks,
Ethan
commit 53dad6d3a8e5ac1af8bacc6ac2134ae1a8b085f1
Author: Davidlohr Bueso <davidlohr@xxxxxx>
Date: Mon Sep 23 17:04:45 2013 -0700

ipc: fix race with LSMs

No, the kernel doesn't have this commit, will try it.
Currently, IPC mechanisms do security and auditing related checks under
RCU. However, since security modules can free the security structure,
for example, through selinux_[sem,msg_queue,shm]_free_security(), we can
race if the structure is freed before other tasks are done with it,
creating a use-after-free condition. Manfred illustrates this nicely,
for instance with shared mem and selinux:
-> do_shmat calls rcu_read_lock()
-> do_shmat calls shm_object_check().
Checks that the object is still valid - but doesn't acquire any locks.
Then it returns.
-> do_shmat calls security_shm_shmat (e.g. selinux_shm_shmat)
-> selinux_shm_shmat calls ipc_has_perm()
-> ipc_has_perm accesses ipc_perms->security
shm_close()
-> shm_close acquires rw_mutex & shm_lock
-> shm_close calls shm_destroy
-> shm_destroy calls security_shm_free (e.g. selinux_shm_free_security)
-> selinux_shm_free_security calls ipc_free_security(&shp->shm_perm)
-> ipc_free_security calls kfree(ipc_perms->security)
This patch delays the freeing of the security structures after all RCU
readers are done. Furthermore it aligns the security life cycle with
that of the rest of IPC - freeing them based on the reference counter.
For situations where we need not free security, the current behavior is
kept. Linus states:
"... the old behavior was suspect for another reason too: having the
security blob go away from under a user sounds like it could cause
various other problems anyway, so I think the old code was at least
_prone_ to bugs even if it didn't have catastrophic behavior."
I have tested this patch with IPC testcases from LTP on both my
quad-core laptop and on a 64 core NUMA server. In both cases selinux is
enabled, and tests pass for both voluntary and forced preemption models.
While the mentioned races are theoretical (at least no one as reported
them), I wanted to make sure that this new logic doesn't break anything
we weren't aware of.


Additionally, Manfred's concerns about the grace period are quite valid,
and it obviously assumes that the ->security nil dereference issue still
exists to some extent. Changes in RCU are something to consider as well.
This is all pretty iffy though, lets make sure we are looking at the
right kernel first.
Thanks,
Ethan

Thanks,
Davidlohr


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/