Re: [PATCH v8 9/9] seccomp: implement SECCOMP_FILTER_FLAG_TSYNC

From: Kees Cook
Date: Wed Jun 25 2014 - 13:57:36 EST


On Wed, Jun 25, 2014 at 10:24 AM, Oleg Nesterov <oleg@xxxxxxxxxx> wrote:
> On 06/25, Kees Cook wrote:
>>
>> On Wed, Jun 25, 2014 at 9:52 AM, Oleg Nesterov <oleg@xxxxxxxxxx> wrote:
>> >
>> > Yes, at least this should close the race with suid-exec. And there are no
>> > other users. Except apparmor, and I hope you will check it because I simply
>> > do not know what it does ;)
>> >
>> >> I wonder if changes to nnp need to "flushed" during syscall entry
>> >> instead of getting updated externally/asynchronously? That way it
>> >> won't be out of sync with the seccomp mode/filters.
>> >>
>> >> Perhaps secure computing needs to check some (maybe seccomp-only)
>> >> atomic flags and flip on the "real" nnp if found?
>> >
>> > Not sure I understand you, could you clarify?
>>
>> Instead of having TSYNC change the nnp bit, it can set a new flag, say:
>>
>> task->seccomp.flags |= SECCOMP_NEEDS_NNP;
>>
>> This would be set along with seccomp.mode, seccomp.filter, and
>> TIF_SECCOMP. Then, during the next secure_computing() call that thread
>> makes, it would check the flag:
>>
>> if (task->seccomp.flags & SECCOMP_NEEDS_NNP)
>> task->nnp = 1;
>>
>> This means that nnp couldn't change in the middle of a running syscall.
>
> Aha, so you were worried about the same thing. Not sure we need this,
> but at least I understand you and...
>
>> Hmmm. Perhaps this doesn't solve anything, though? Perhaps my proposal
>> above would actually make things worse, since now we'd have a thread
>> with seccomp set up, and no nnp. If it was in the middle of exec,
>> we're still causing a problem.
>
> Yes ;)
>
>> I think we'd also need a way to either delay the seccomp changes, or
>> to notice this condition during exec. Bleh.
>
> Hmm. confused again,

I mean to suggest that the tsync changes would be stored in each
thread, but somewhere other than the true seccomp struct, but with
TIF_SECCOMP set. When entering secure_computing(), current would check
for the "changes to sync", and apply them, then start the syscall. In
this way, we can never race a syscall (like exec).

>> What actually happens with a multi-threaded process calls exec? I
>> assume all the other threads are destroyed?
>
> Yes. But this is the point-of-no-return, de_thread() is called after the execing
> thared has already passed (say) check_unsafe_exec().
>
> However, do_execve() takes cred_guard_mutex at the start in prepare_bprm_creds()
> and drops it in install_exec_creds(), so it should solve the problem?

I can't tell yet. I'm still trying to understand the order of
operations here. It looks like de_thread() takes the sighand lock.
do_execve_common does:

prepare_bprm_creds (takes cred_guard_mutex)
check_unsafe_exec (checks nnp to set LSM_UNSAFE_NO_NEW_PRIVS)
prepare_binprm (handles suid escalation, checks nnp separately)
security_bprm_set_creds (checks LSM_UNSAFE_NO_NEW_PRIVS)
exec_binprm
load_elf_binary
flush_old_exec
de_thread (takes and releases sighand->lock)
install_exec_creds (releases cred_guard_mutex)

I don't see a way to use cred_guard_mutex during tsync (which holds
sighand->lock) without dead-locking. What were you considering here?

-Kees

--
Kees Cook
Chrome OS Security
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/