Re: [PATCH v3 13/15] livepatch: change to a per-task consistency model

From: Miroslav Benes
Date: Wed Jan 04 2017 - 08:44:59 EST


On Thu, 8 Dec 2016, Josh Poimboeuf wrote:

> +void klp_start_transition(void)
> +{
> +	struct task_struct *g, *task;
> +	unsigned int cpu;
> +
> +	WARN_ON_ONCE(klp_target_state == KLP_UNDEFINED);
> +
> +	pr_notice("'%s': %s...\n", klp_transition_patch->mod->name,
> +		  klp_target_state == KLP_PATCHED ? "patching" : "unpatching");
> +
> +	/*
> +	 * If the patch can be applied or reverted immediately, skip the
> +	 * per-task transitions.
> +	 */
> +	if (klp_transition_patch->immediate)
> +		return;
> +
> +	/*
> +	 * Mark all normal tasks as needing a patch state update. As they pass
> +	 * through the syscall barrier they'll switch over to the target state
> +	 * (unless we switch them in klp_try_complete_transition() first).
> +	 */
> +	read_lock(&tasklist_lock);
> +	for_each_process_thread(g, task)
> +		set_tsk_thread_flag(task, TIF_PATCH_PENDING);
> +	read_unlock(&tasklist_lock);
> +
> +	/*
> +	 * Ditto for the idle "swapper" tasks, though they never cross the
> +	 * syscall barrier. Instead they switch over in cpu_idle_loop().
...or we switch them in klp_try_complete_transition() first by looking at
their stacks, right? I would add that to the comment.

> +	 */
> +	get_online_cpus();
> +	for_each_online_cpu(cpu)
> +		set_tsk_thread_flag(idle_task(cpu), TIF_PATCH_PENDING);
> +	put_online_cpus();
> +}

[...]

> --- a/kernel/sched/idle.c
> +++ b/kernel/sched/idle.c
> @@ -9,6 +9,7 @@
>  #include <linux/mm.h>
>  #include <linux/stackprotector.h>
>  #include <linux/suspend.h>
> +#include <linux/livepatch.h>
>
>  #include <asm/tlb.h>
>
> @@ -264,6 +265,9 @@ static void do_idle(void)
>
>  	sched_ttwu_pending();
>  	schedule_preempt_disabled();
> +
> +	if (unlikely(klp_patch_pending(current)))
> +		klp_update_patch_state(current);
>  }

I think that (at least theoretically) this is not sufficient if we patch a
function present on an idle task's stack and one of the following two
scenarios happens.

1. there is nothing to schedule on a cpu, so the idle task does not leave
the loop in do_idle() for some time. This may be nonsense in practice, and
if it is not, we could solve it with schedule_on_each_cpu() on an empty
stub somewhere in our code.

2. there is a cpu-bound process running on one of the cpus. There is no
chance of getting to do_idle() there at all, so the idle task would block
the patching. We ran into this in kGraft and I tried to solve it with this
new hunk in pick_next_task()...

+	/*
+	 * Patching is in progress, schedule an idle task to migrate it
+	 */
+	if (kgr_in_progress_branch()) {
+		if (!test_bit(0, kgr_immutable) &&
+		    klp_kgraft_task_in_progress(rq->idle)) {
+			p = idle_sched_class.pick_next_task(rq, prev);
+
+			return p;
+		}
+	}

(kgr_in_progress_branch() is basically a static key. The kgr_immutable flag
solves something we don't have a problem with in upstream livepatch, thanks
to the combination of task->patch_state and klp_func->transition.
klp_kgraft_task_in_progress() checks the livepatching TIF flag of a task.)

It is not tested properly and it is a hack as hell, so take it as such.
Also note that the problem is more serious in kGraft, as we don't have
stack checking there, so any livepatch could cause the issue easily.

I can imagine even crazier solutions, but nothing nice and pretty (which is
probably impossible, because the whole idea of deliberately scheduling an
idle task is not nice and pretty in itself).

Otherwise the patch looks good to me. I don't understand how Petr found
those races there.

Regards,
Miroslav