Re: [PATCH v2] arm64/sve: Fix wrong free for task->thread.sve_state

From: Dave Martin
Date: Mon Sep 30 2019 - 11:34:28 EST


On Mon, Sep 30, 2019 at 10:29:53AM -0400, Masayoshi Mizuma wrote:
> Hi Julien and Dave,
>
> On Mon, Sep 30, 2019 at 02:02:46PM +0100, Dave Martin wrote:
> > On Mon, Sep 30, 2019 at 01:23:18PM +0100, Julien Grall wrote:
> > > Hi,
> > >
> > > On 27/09/2019 16:39, Masayoshi Mizuma wrote:
> > > >From: Masayoshi Mizuma <m.mizuma@xxxxxxxxxxxxxx>
> > > >
> > > >The system which has SVE feature crashed because of
> > > >the memory pointed by task->thread.sve_state was destroyed
> > > >by someone.
> > > >
> > > >That is because sve_state is freed while the forking the
> > > >child process. The child process has the pointer of sve_state
> > > >which is same as the parent's because the child's task_struct
> > > >is copied from the parent's one. If the copy_process()
> > > >fails as an error on somewhere, for example, copy_creds(),
> > > >then the sve_state is freed even if the parent is alive.
> > > >The flow is as follows.
> > > >
> > > >copy_process
> > > > p = dup_task_struct
> > > > => arch_dup_task_struct
> > > > *dst = *src; // copy the entire region.
> > > >:
> > > > retval = copy_creds
> > > > if (retval < 0)
> > > > goto bad_fork_free;
> > > >:
> > > >bad_fork_free:
> > > >...
> > > > delayed_free_task(p);
> > > > => free_task
> > > > => arch_release_task_struct
> > > > => fpsimd_release_task
> > > > => __sve_free
> > > > => kfree(task->thread.sve_state);
> > > > // free the parent's sve_state
> > > >
> > > >Move child's sve_state = NULL and clearing TIF_SVE flag
> > > >to arch_dup_task_struct() so that the child doesn't free the
> > > >parent's one.
> > > >
> > > >Cc: stable@xxxxxxxxxxxxxxx
> > > >Fixes: bc0ee4760364 ("arm64/sve: Core task context handling")
> > >
> > > Looking at the log, it looks like THREAD_INFO_IN_TASK was selected before
> > > the bc0ee4760364. So it should be fine to backport for all the Linux tree
> > > contain this commit.
>
> I think this patch is needed for the kernel has SVE support.
> I'll add the Cc tag as Dave said:
>
> Cc: stable@xxxxxxxxxxxxxxx # 4.15+

Based on running git log --grep='stable@vger.*#.*-', the most common
syntax for this situation seems to be

Cc: <stable@xxxxxxxxxxxxxxx> # 4.15.x-

> So, I suppose this patch will be backported to stables 5.3.X,
> 5.2.X and longterm 4.19.X.
> Does this make sense?

Yes, I think so. It's up to the stable maintainers to decide which
trees the patch actually gets applied too.

> > > >Signed-off-by: Masayoshi Mizuma <m.mizuma@xxxxxxxxxxxxxx>
> > > >Reported-by: Hidetoshi Seto <seto.hidetoshi@xxxxxxxxxxxxxx>
> > > >Suggested-by: Dave Martin <Dave.Martin@xxxxxxx>
> > >
> > > I have tested the patch and can confirm that double-free disappeared after
> > > the patch is applied:
> > >
> > > Tested-by: Julien Grall <julien.grall@xxxxxxx>
>
> Thank you so much!
>
> >
> > Good to have that confirmed -- thanks for verifying.
> >
> > [...]
> >
> > > >---
> > > > arch/arm64/kernel/process.c | 21 ++++-----------------
> > > > 1 file changed, 4 insertions(+), 17 deletions(-)
> > > >
> > > >diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> > > >index f674f28df..6937f5935 100644
> > > >--- a/arch/arm64/kernel/process.c
> > > >+++ b/arch/arm64/kernel/process.c
> > > >@@ -323,22 +323,16 @@ void arch_release_task_struct(struct task_struct *tsk)
> > > > fpsimd_release_task(tsk);
> > > > }
> > > >-/*
> > > >- * src and dst may temporarily have aliased sve_state after task_struct
> > > >- * is copied. We cannot fix this properly here, because src may have
> > > >- * live SVE state and dst's thread_info may not exist yet, so tweaking
> > > >- * either src's or dst's TIF_SVE is not safe.
> > > >- *
> > > >- * The unaliasing is done in copy_thread() instead. This works because
> > > >- * dst is not schedulable or traceable until both of these functions
> > > >- * have been called.
> > > >- */
> > >
> > > It would be good to explain in the commit message why tweaking "dst" in
> > > arch_dup_task_struct() is fine.
> > >
> > > From my understanding, Arm64 used to have thread_info on the stack. So it
> > > would not be possible to clear TIF_SVE until the stack is initialized.
> > >
> > > Now that the thread_info is part of the task, it should be valid to modify
> > > the flag from arch_dup_task_struct().
> > >
> > > Note that technically, TIF_SVE does not need to be cleared from
> > > arch_dup_task_struct(). It could also be done from copy_thread(). But it is
> > > easier to keep the both changes together.
>
> Thanks, let me add some comments to the commit log.
>
> > >
> > > > int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
> > > > {
> > > > if (current->mm)
> > > > fpsimd_preserve_current_state();
> > > > *dst = *src;
> >
> > Ack, some more explanation would be a good idea here.
> >
> > Maybe the following comments are sufficient?
> >
> > /* We rely on the above assingment to initialise dst's thread_flags: */
>
> Thanks, I'll add this comment.
>
> >
> > > >+ BUILD_BUG_ON(!IS_ENABLED(CONFIG_THREAD_INFO_IN_TASK));
> > >
> >
> > and
> >
> > /*
> > * Detach src's sve_state (if any) from dst so that it does not
> > * get erroneously used or freed prematurely. dst's sve_state
> > * will be allocated on demand later on if dst uses SVE.
> > * For consistency, also clear TIF_SVE here: this could be done
> > * later in copy_process(), but to avoid tripping up future
> > * maintainers it is best not to leave TIF_SVE and sve_state in
> > * an inconsistent state, even temporarily.
> > */
>
> I'll add this comments.

OK, if these make sense to you, feel free to add them.

[...]

Cheers
---Dave