Re: [PATCH] proc: restrict kernel stack dumps to root

From: Kees Cook
Date: Wed Sep 12 2018 - 18:28:02 EST


On Wed, Sep 12, 2018 at 8:29 AM, Jann Horn <jannh@xxxxxxxxxx> wrote:
> +linux-api, I guess
>
> On Tue, Sep 11, 2018 at 8:39 PM Jann Horn <jannh@xxxxxxxxxx> wrote:
>>
>> Restrict the ability to inspect kernel stacks of arbitrary tasks to root
>> in order to prevent a local attacker from exploiting racy stack unwinding
>> to leak kernel task stack contents.
>> See the added comment for a longer rationale.
>>
>> There don't seem to be any users of this userspace API that can't
>> gracefully bail out if reading from the file fails. Therefore, I believe
>> that this change is unlikely to break things.
>> In the case that this patch does end up needing a revert, the next-best
>> solution might be to fake a single-entry stack based on wchan.
>>
>> Fixes: 2ec220e27f50 ("proc: add /proc/*/stack")
>> Cc: stable@xxxxxxxxxxxxxxx
>> Signed-off-by: Jann Horn <jannh@xxxxxxxxxx>
>> ---
>> fs/proc/base.c | 14 ++++++++++++++
>> 1 file changed, 14 insertions(+)
>>
>> diff --git a/fs/proc/base.c b/fs/proc/base.c
>> index ccf86f16d9f0..7e9f07bf260d 100644
>> --- a/fs/proc/base.c
>> +++ b/fs/proc/base.c
>> @@ -407,6 +407,20 @@ static int proc_pid_stack(struct seq_file *m, struct pid_namespace *ns,
>>  	unsigned long *entries;
>>  	int err;
>>
>> +	/*
>> +	 * The ability to racily run the kernel stack unwinder on a running task
>> +	 * and then observe the unwinder output is scary; while it is useful for
>> +	 * debugging kernel issues, it can also allow an attacker to leak kernel
>> +	 * stack contents.
>> +	 * Doing this in a manner that is at least safe from races would require
>> +	 * some work to ensure that the remote task can not be scheduled; and
>> +	 * even then, this would still expose the unwinder as local attack
>> +	 * surface.
>> +	 * Therefore, this interface is restricted to root.
>> +	 */
>> +	if (!file_ns_capable(m->file, &init_user_ns, CAP_SYS_ADMIN))
>> +		return -EACCES;

In the past, we've avoided hard errors like this in favor of just
censoring the output. Do we want to be more cautious here? That is,
either a bare "return 0;", which gives an empty file, or the fuller
seq_printf(m, "[<0>] privileged\n"); return 0;
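
To spell out the behaviours being compared, here is a rough userspace
sketch, not kernel code. The enum and stack_show_unprivileged() are
names made up for illustration; only the -EACCES return and the
"[<0>] privileged" line come from this thread.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical policy names, used only for this illustration. */
enum stack_policy {
	STACK_POLICY_EACCES,	/* patch as posted: hard error */
	STACK_POLICY_EMPTY,	/* succeed, emit nothing */
	STACK_POLICY_CENSORED,	/* succeed, emit one placeholder frame */
};

/*
 * Userspace stand-in for the unprivileged path of proc_pid_stack().
 * Fills buf with what a reader would see and returns 0 or a negative
 * errno, mirroring the kernel's return convention.
 */
static int stack_show_unprivileged(enum stack_policy policy,
				   char *buf, size_t len)
{
	if (len == 0)
		return -EINVAL;
	buf[0] = '\0';

	switch (policy) {
	case STACK_POLICY_EACCES:
		return -EACCES;
	case STACK_POLICY_EMPTY:
		return 0;
	case STACK_POLICY_CENSORED:
		snprintf(buf, len, "[<0>] privileged\n");
		return 0;
	}
	return -EINVAL;
}
```

With the first policy the caller gets an error it has to handle; with
the other two the call succeeds and the caller sees either no frames or
a single placeholder frame.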

>> +
>>  	entries = kmalloc_array(MAX_STACK_TRACE_DEPTH, sizeof(*entries),
>>  				GFP_KERNEL);
>>  	if (!entries)
>> --
>> 2.19.0.rc2.392.g5ba43deb5a-goog
>>

-Kees

--
Kees Cook
Pixel Security