Re: [PATCH v2] do_exit(): Make sure we run with get_fs() == USER_DS.

From: Andrew Morton
Date: Wed Dec 01 2010 - 20:12:43 EST


On Tue, 30 Nov 2010 21:27:36 -0500
Nelson Elhage <nelhage@xxxxxxxxxxx> wrote:

> If a user manages to trigger an oops with fs set to KERNEL_DS, fs is not
> otherwise reset before do_exit(). do_exit may later (via mm_release in fork.c)
> do a put_user to a user-controlled address, potentially allowing a user to
> leverage an oops into a controlled write into kernel memory.
>
> A more logical place to put this might be when we know an oops has occurred,
> before we call do_exit(), but that would involve changing every architecture, in
> multiple places. Let's just stick it in do_exit instead.
>
> Signed-off-by: Nelson Elhage <nelhage@xxxxxxxxxxx>
> ---
> kernel/exit.c | 8 ++++++++
> 1 files changed, 8 insertions(+), 0 deletions(-)
>
> diff --git a/kernel/exit.c b/kernel/exit.c
> index 21aa7b3..68899b3 100644
> --- a/kernel/exit.c
> +++ b/kernel/exit.c
> @@ -914,6 +914,14 @@ NORET_TYPE void do_exit(long code)
>  	if (unlikely(!tsk->pid))
>  		panic("Attempted to kill the idle task!");
>
> +	/*
> +	 * If do_exit is called because this process oopsed, it's possible
> +	 * that get_fs() was left as KERNEL_DS, so reset it to USER_DS before
> +	 * continuing. This is relevant at least for clearing clear_child_tid in
> +	 * mm_release.
> +	 */
> +	set_fs(USER_DS);
> +
>  	tracehook_report_exit(&code);
>
>  	validate_creds_for_do_exit(tsk);

I think the potential for escalating an oops or a BUG into a local
root hole is pretty serious, so I'll send this fix along for 2.6.37.
I've also tagged it for -stable backporting and given it a
sterner-sounding changelog.



From: Nelson Elhage <nelhage@xxxxxxxxxxx>

If a user manages to trigger an oops with fs set to KERNEL_DS, fs is not
otherwise reset before do_exit(). do_exit may later (via mm_release in
fork.c) do a put_user to a user-controlled address, potentially allowing a
user to leverage an oops into a controlled write into kernel memory.

This is only triggerable in the presence of another bug, but this
potentially turns a lot of DoS bugs into privilege escalations, so it's
worth fixing. I have proof-of-concept code which uses this bug along with
CVE-2010-3849 to write a zero to an arbitrary kernel address, so I've
tested that this is not theoretical.


A more logical place to put this fix might be when we know an oops has
occurred, before we call do_exit(), but that would involve changing every
architecture, in multiple places. Let's just stick it in do_exit instead.

[akpm@xxxxxxxxxxxxxxxxxxxx: update code comment]
Signed-off-by: Nelson Elhage <nelhage@xxxxxxxxxxx>
Cc: KOSAKI Motohiro <kosaki.motohiro@xxxxxxxxxxxxxx>
Cc: <stable@xxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

kernel/exit.c | 9 +++++++++
1 file changed, 9 insertions(+)

diff -puN kernel/exit.c~do_exit-make-sure-we-run-with-get_fs-==-user_ds kernel/exit.c
--- a/kernel/exit.c~do_exit-make-sure-we-run-with-get_fs-==-user_ds
+++ a/kernel/exit.c
@@ -914,6 +914,15 @@ NORET_TYPE void do_exit(long code)
 	if (unlikely(!tsk->pid))
 		panic("Attempted to kill the idle task!");
 
+	/*
+	 * If do_exit is called because this process oopsed, it's possible
+	 * that get_fs() was left as KERNEL_DS, so reset it to USER_DS before
+	 * continuing. Amongst other possible reasons, this is to prevent
+	 * mm_release()->clear_child_tid() from writing to a user-controlled
+	 * kernel address.
+	 */
+	set_fs(USER_DS);
+
 	tracehook_report_exit(&code);
 
 	validate_creds_for_do_exit(tsk);
_
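
For anyone who wants to see the failure mode without reading the
exit path, here is a rough user-space model of what the changelog
describes. It is only a sketch, not kernel code: the model_* names,
the toy memory array and the index-based "address limit" are all
made up for illustration and merely stand in for get_fs()/set_fs(),
put_user(), mm_release() and tsk->clear_child_tid.

```c
#include <assert.h>
#include <stddef.h>

/* Toy address space: indices [0, 8) model user memory, [8, 16) kernel memory. */
#define MODEL_MEM_WORDS  16
#define MODEL_USER_DS    8                /* hypothetical stand-in for USER_DS */
#define MODEL_KERNEL_DS  MODEL_MEM_WORDS  /* hypothetical stand-in for KERNEL_DS */

struct model_task {
	size_t addr_limit;       /* models the value returned by get_fs() */
	size_t clear_child_tid;  /* user-chosen index, models tsk->clear_child_tid */
	int memory[MODEL_MEM_WORDS];
};

/* Models put_user(): refuses any index at or above the current limit. */
static int model_put_user(struct model_task *t, int val, size_t idx)
{
	if (idx >= t->addr_limit)
		return -1;       /* models -EFAULT */
	t->memory[idx] = val;
	return 0;
}

/* Models mm_release() writing a zero through clear_child_tid. */
static void model_mm_release(struct model_task *t)
{
	(void)model_put_user(t, 0, t->clear_child_tid);
}

/* Models do_exit(); apply_fix selects whether the limit is reset first. */
static void model_do_exit(struct model_task *t, int apply_fix)
{
	if (apply_fix)
		t->addr_limit = MODEL_USER_DS;  /* models set_fs(USER_DS) */
	model_mm_release(t);
}

/*
 * Models a task that oopsed while its limit was still KERNEL_DS and
 * whose clear_child_tid points at a "kernel" word.
 * Returns 1 if that kernel word was zeroed during exit, 0 otherwise.
 */
static int kernel_word_clobbered(int apply_fix)
{
	struct model_task t = {
		.addr_limit = MODEL_KERNEL_DS,
		.clear_child_tid = 12,
	};

	assert(t.clear_child_tid >= MODEL_USER_DS);  /* target is in the kernel half */
	t.memory[12] = 0x5a5a;
	model_do_exit(&t, apply_fix);
	return t.memory[12] == 0;
}
```

In this model kernel_word_clobbered(0) returns 1 (the zero lands in
the kernel half) and kernel_word_clobbered(1) returns 0, because the
limit check rejects the write once the limit has been reset.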
