Re: Question: livepatch failed for new fork() task stack unreliable

From: Josh Poimboeuf
Date: Fri May 29 2020 - 13:44:49 EST


On Fri, May 29, 2020 at 06:10:59PM +0800, Wang ShaoBo wrote:
> Stack unreliable error is reported by stack_trace_save_tsk_reliable() when trying
> to insmod a hot patch for module modification, this sometimes results in frequent
> failures. We found this 'unreliable' stack is from a task that has just forked.

For livepatch, this shouldn't actually be a failure. The patch will
just stay in the transition state until after the fork has completed,
which should happen in a reasonable amount of time, right?
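
To illustrate what I mean by "stay in the transition state", here is a
toy userspace model (not kernel code). Every name in it (toy_task,
toy_stack_reliable, toy_try_complete, toy_run_transition) is made up for
this sketch, and it assumes a freshly forked task becomes reliably
unwindable after a fixed number of passes. An unreliable stack only
defers that one task to a later pass; it does not fail the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a task; this is not the kernel's task_struct. */
struct toy_task {
	const char *name;
	int passes_until_reliable;	/* 0 means the stack unwinds reliably now */
	bool patched;
};

/* Toy analogue of a reliable stack check: fails for a just-forked task. */
static bool toy_stack_reliable(const struct toy_task *t)
{
	return t->passes_until_reliable == 0;
}

/*
 * One transition pass: switch every task whose stack can be trusted.
 * Returns true once all tasks have been switched to the patched state.
 */
static bool toy_try_complete(struct toy_task *tasks, int n)
{
	bool done = true;

	for (int i = 0; i < n; i++) {
		if (tasks[i].patched)
			continue;
		if (toy_stack_reliable(&tasks[i])) {
			tasks[i].patched = true;
		} else {
			/* Assumption: the task makes progress between passes. */
			tasks[i].passes_until_reliable--;
			done = false;
		}
	}
	return done;
}

/* Retry until done; returns the number of passes used, or -1 if it gives up. */
static int toy_run_transition(struct toy_task *tasks, int n, int max_passes)
{
	for (int pass = 1; pass <= max_passes; pass++)
		if (toy_try_complete(tasks, n))
			return pass;
	return -1;
}
```

With one ordinary task and one forked child that needs two more passes,
the model finishes on the third pass. The unreliable stack delays the
transition but does not abort it.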

> 1) The task was not actually scheduled to execute, at this time UNWIND_HINT_EMPTY in
> ret_from_fork() has not reset unwind_hint, its sp_reg and end fields remain at their default values
> and end up throwing an error in unwind_next_frame() when called by arch_stack_walk_reliable();

Yes, this seems to be true for forked-but-not-yet-scheduled tasks.

I can look at fixing that. I have some ORC cleanups in progress which
are related to UNWIND_HINT_EMPTY and the end of the stack. I can add
this issue to the list of improvements.

> 2) The task has been scheduled but UNWIND_HINT_REGS has not finished, at this time
> arch_stack_walk_reliable() terminates its backtracing loop because pt_regs is unknown
> and returns -EINVAL because it's a user task.

Hm, do you see this problem with upstream? It seems like it should
work. arch_stack_walk_reliable() has this:

	/* Success path for user tasks */
	if (user_mode(regs))
		return 0;

Where exactly is the error coming from?

--
Josh