Re: [PATCH] fork: fix error handling in dup_task()

From: Chris Metcalf
Date: Fri Jul 13 2012 - 12:46:08 EST


On 7/13/2012 6:07 AM, Akinobu Mita wrote:
> 2012/7/13 Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>:
>> On Thu, 12 Jul 2012 20:04:53 +0900
>> Akinobu Mita <akinobu.mita@xxxxxxxxx> wrote:
>>
>>> The function dup_task() may fail at the following function calls in
>>> the following order.
>>>
>>> 0) alloc_task_struct_node()
>>> 1) alloc_thread_info_node()
>>> 2) arch_dup_task_struct()
>>>
>>> Error by 0) is not a matter, it can just return. But error by 1)
>>> requires releasing task_struct allocated by 0) before it returns.
>>> Likewise, error by 2) requires releasing task_struct and thread_info
>>> allocated by 0) and 1).
>>>
>>> The existing error handling calls free_task_struct() and
>>> free_thread_info() which do not only release task_struct and
>>> thread_info, but also call architecture specific
>>> arch_release_task_struct() and arch_release_thread_info().
>>>
>>> The problem is that task_struct and thread_info are not fully
>>> initialized yet at this point, but arch_release_task_struct() and
>>> arch_release_thread_info() are called with them.
>>>
>>> For example, x86 defines its own arch_release_task_struct() that
>>> releases a task_xstate. If alloc_thread_info_node() fails in
>>> dup_task(), arch_release_task_struct() is called with task_struct
>>> which is just allocated and filled with garbage in this error handling.
>>>
>>> This actually happened with tools/testing/fault-injection/failcmd.sh
>>>
>>> # env FAILCMD_TYPE=fail_page_alloc \
>>> ./tools/testing/fault-injection/failcmd.sh --times=100 \
>>> --min-order=0 --ignore-gfp-wait=0 \
>>> -- make -C tools/testing/selftests/ run_tests
>>>
>>> In order to fix this issue, make free_{task_struct,thread_info}() not
>>> to call arch_release_{task_struct,thread_info}() and call
>>> arch_release_{task_struct,thread_info}() implicitly where needed.
>>>
>>> Default arch_release_task_struct() and arch_release_thread_info() are
>>> defined as empty by default. So this change only affects the
>>> architectures which implement their own arch_release_task_struct() or
>>> arch_release_thread_info() as listed below.
>> This conflicts with Salman's fix (below) which is in linux-next via
>> Ingo's tree.
>>
>> It appears that we should drop Salman's patch altogether and use yours?
> Yes. Salman's patch fixes error handling for x86, sh, and mn10300.
> But it doesn't fix for tile. If tile's arch_release_thread_info() is
> called after setup_thread_stack(tsk, orig), it may release original
> task's step_state. (tsk->step_state will be cleared by tile's
> copy_thread() after dup_task_struct()).

Yes, I think Akinobu's patch is better. Thanks.

--
Chris Metcalf, Tilera Corp.
http://www.tilera.com



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/