On Sat, May 30, 2020 at 10:21:19AM +0800, Wangshaobo (bobo) wrote:
> 1) when a user mode task just fork start executing ret_from_fork() till
> schedule_tail, unwind_next_frame found
> orc->sp_reg is ORC_REG_UNDEFINED but orc->end not equals zero, this time
> arch_stack_walk_reliable()
> terminates its backtracing loop for unwind_done() return true. then 'if
> (!(task->flags & (PF_KTHREAD | PF_IDLE)))'
> in arch_stack_walk_reliable() true and return -EINVAL after.
>
> * The stack trace looks like that:
>
> ret_from_fork
>       -=> UNWIND_HINT_EMPTY
>       -=> schedule_tail             /* schedule out */
>       ...
>       -=> UNWIND_HINT_REGS          /* UNDO */

Yes, makes sense.

> 2) when using call_usermodehelper_exec_async() to create a user mode task,
> ret_from_fork() still not exec whereas
> the task has been scheduled in __schedule(), at this time, orc->sp_reg is
> ORC_REG_UNDEFINED but orc->end equals zero,
> unwind_error() return true and also terminates arch_stack_walk_reliable()'s
> backtracing loop, end up return from
> 'if (unwind_error())' branch.
>
> * The stack trace looks like that:
>
> -=> call_usermodehelper_exec
>   -=> do_exec
>     -=> search_binary_handler
>       -=> load_elf_binary
>         -=> elf_map
>           -=> vm_mmap_pgoff
>             -=> down_write_killable
>               -=> _cond_resched
>                 -=> __schedule          /* scheduled to work */
> -=> ret_from_fork       /* UNDO */

I don't quite follow the stacktrace, but it sounds like the issue is the
same as the first one you originally reported:

> 1) The task was not actually scheduled to execute, at this time
> UNWIND_HINT_EMPTY in ret_from_fork() has not reset unwind_hint, its
> sp_reg and end field remain default value and end up throwing an error
> in unwind_next_frame() when called by arch_stack_walk_reliable();

Or am I misunderstanding?

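For what it's worth, here is a rough userspace sketch of the two exits as
described above. It is not the kernel code: every model_/MODEL_ name and the
flag values are made up for illustration, and running out of frames is treated
as an error only to keep the model small.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Illustrative stand-ins only; these are NOT the kernel's flag values. */
#define MODEL_PF_IDLE    0x1u
#define MODEL_PF_KTHREAD 0x2u

enum model_sp_reg { MODEL_REG_UNDEFINED, MODEL_REG_SP };

/* Hypothetical, trimmed-down ORC entry: only the two fields discussed here. */
struct model_orc {
	enum model_sp_reg sp_reg;
	bool end;
};

struct model_state {
	bool done;	/* walk reached a marked end of stack */
	bool error;	/* walk hit something it cannot trust */
};

/* Undefined SP either ends the walk (end set) or is an error (end clear). */
static void model_next_frame(struct model_state *s, const struct model_orc *orc)
{
	if (orc->sp_reg == MODEL_REG_UNDEFINED) {
		if (orc->end)
			s->done = true;
		else
			s->error = true;
	}
}

/* Walk one ORC entry per frame; return 0 if reliable, -EINVAL otherwise. */
static int model_walk_reliable(unsigned int task_flags,
			       const struct model_orc *frames, int nr)
{
	struct model_state s = { false, false };
	int i;

	assert(frames && nr > 0);

	for (i = 0; i < nr && !s.done && !s.error; i++)
		model_next_frame(&s, &frames[i]);

	/* Case 2 above: the error exit. (Model only: no end marker is an error too.) */
	if (s.error || !s.done)
		return -EINVAL;

	/* Case 1 above: clean end of stack, but the task is not a kthread or idle task. */
	if (!(task_flags & (MODEL_PF_KTHREAD | MODEL_PF_IDLE)))
		return -EINVAL;

	return 0;
}
```

In this model a user task that stops at an end-marked entry still gets
-EINVAL, while a kthread stopping at the same entry gets 0. An undefined SP
without the end marker fails for both.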
And to reiterate, these are not "livepatch failures", right? Livepatch
doesn't fail when stack_trace_save_tsk_reliable() returns an error. It
recovers gracefully and tries again later.