Re: linux-next 20170519 - semaphores broken

From: Kees Cook
Date: Tue May 23 2017 - 16:27:38 EST


On Sun, May 21, 2017 at 11:12 AM, Manfred Spraul
<manfred@xxxxxxxxxxxxxxxx> wrote:
> Hi Kees,
>
>
> On 05/21/2017 07:20 AM, Kees Cook wrote:
>>
>> On Sat, May 20, 2017 at 1:18 PM, <valdis.kletnieks@xxxxxx> wrote:
>>>
>>> Seeing problems with programs that use semaphores. The one
>>> that I'm getting bit by is jackd. strace says:
>>>
>>> getuid() = 967
>>> semget(0x282929, 0, 000) = 229376
>>> semop(229376, [{0, -1, SEM_UNDO}], 1) = -1 EIDRM (Identifier removed)
>>> write(2, "JACK semaphore error: semop (Ide"..., 49JACK semaphore error:
>>> semop (Identifier removed)
>>> ) = 49
>>>
>>> Bisects down to this commit, and reverting it from 20170519 makes things
>>> work
>>> again. No idea why this causes indigestion, there's probably something
>>> subtly
>>> wrong here....
>>>
>>> commit 337f43326737b5eb28eb13f43c27a5788da0f913
>>> Author: Manfred Spraul <manfred@xxxxxxxxxxxxxxxx>
>>> Date: Fri May 19 07:39:23 2017 +1000
>>>
>>> ipc: merge ipc_rcu and kern_ipc_perm
>>>
>>> ipc has two management structures that exist for every id:
>>> - struct kern_ipc_perm, it contains e.g. the permissions.
>>> - struct ipc_rcu, it contains the rcu head for rcu handling and
>>> the refcount.
>>
>> I think I found the cause of this. Prior to this change, the RCU (with
>> refcount) is located ahead of the struct sem_array. After this change,
>> the RCU and refcount is within it, so this is happening:
>>
>> sma = container_of(ipc_rcu_alloc(size), struct sem_array,
>> sem_perm);
>> if (!sma)
>> return -ENOMEM;
>>
>> memset(sma, 0, size);
>>
>> ipc_rcu_alloc() initializes the refcount to 1, and the memset bumps it
>> back to zero.
>
> Correct.
>
>> A work-around would be to wrap the memset() like this:
>>
>> struct ipc_kern_perm perm;
>> ...
>> perm = sma->sem_perm;
>> memset(sma, 0, size);
>> sma->sem_perm = perm;
>
> No!
> The quick workaround would be to move the memset into ipc_rcu_alloc().

Right, right. I meant if Valdis wanted to test, this would isolate the problem.

>> I actually have a series that changes things much more, and moves the
>> refcount set to ipc_addid() which is the only place it needs to happen
>> (though this requires fixing up the mistaken rcu freeing on error
>> paths). Here's the lightly tested series, on top of -next:
>>
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/log/?h=kspp/gcc-plugin/randstruct-next-20170519-ipc
>>
>> Manfred, I think I could get to the same results in fewer logical
>> steps, but I'm curious to see what you think of what I've got first.
>> Mainly I've done the following things:
>>
>> - remove unneeded RCU free calls (since the IPC memory is only exposed
>> to RCUness once ipc_addid() succeeds.
>
> This is the step that I'm not 100% sure about:
> What about the security_sem_alloc?
> Is it possible that a security module relies on the RCU?

The LSM can't take a reference to the newly allocated IPC structures
because that would cause *_rcu_free() to never get called, and things
would get pinned in memory.

However, in reading through the code more carefully, I do see that I
was missing freeing the security struct after a successful
security_*_alloc() call. So while RCU doesn't need to be involved, I
do have to more carefully handle the exit paths to correctly free
whatever the LSM allocated on ->perm.security.

I'll fix this, consolidate things a bit, and post my series.

> The series looks good to me, with the exception of the security interface.
> I think your changes are safe, but for that part I'm not certain.
>
> I'll try to do a deeper review/test during the next week, then I would Ack
> the patches.

Awesome, thanks for taking a look. I'll rework my patches and post them.

-Kees

--
Kees Cook
Pixel Security