[RFC kgr on klp 9/9] livepatch: send a fake signal to all tasks
From: Jiri Slaby
Date: Mon May 04 2015 - 07:40:45 EST
From: Miroslav Benes <mbenes@xxxxxxx>

The kGraft consistency model is LEAVE_KERNEL combined with SWITCH_THREAD.
This means that every task in the system has to be marked, one by one,
as safe to call a new patched function. The safe place is the boundary
between kernel and userspace. Patching waits for all tasks to cross this
boundary and finishes the process afterwards.

The problem is that a task can block the finalization of the patching
process for quite a long time, if not forever. The task could sleep
somewhere in the kernel or could be running in userspace with no
prospect of entering the kernel and thus passing through the safe place.

Luckily we can force the task to do that by sending it a fake signal,
that is, a signal with no data in the signal pending structures (no
handler, no sign of a proper signal having been delivered).
Suspend/freezer uses this to freeze tasks as well. The task gets
TIF_SIGPENDING set and is woken up (if it was sleeping in the kernel) or
kicked by a rescheduling IPI (if it was running on another CPU). This
causes the task to go to the kernel/userspace boundary, where the signal
would be handled and the task would be marked as safe in terms of live
patching.
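
As a rough illustration of the flow above, here is a minimal userspace
model, not kernel code. The struct and the model_* helpers are
hypothetical stand-ins for PF_KTHREAD, klp_kgraft_task_in_progress() and
TIF_SIGPENDING, and the freezing() check is left out for brevity.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for per-task state; not the kernel's task_struct. */
struct model_task {
	bool kthread;		/* models PF_KTHREAD */
	bool in_progress;	/* models klp_kgraft_task_in_progress() */
	bool sigpending;	/* models TIF_SIGPENDING */
};

/* Flag every non-kthread task that is not yet migrated; return the count. */
static size_t model_send_fake_signal(struct model_task *tasks, size_t n)
{
	size_t kicked = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!tasks[i].kthread && tasks[i].in_progress) {
			tasks[i].sigpending = true;
			kicked++;
		}
	}
	return kicked;
}

/* The task crossed the kernel/userspace boundary and is now safe. */
static void model_mark_safe(struct model_task *t)
{
	t->in_progress = false;
}

/*
 * Clear the pending flag only if no real signal is queued and the task
 * is already migrated, so the fake signal survives until then.
 */
static void model_recalc_sigpending(struct model_task *t, bool real_signal)
{
	if (!real_signal && !t->in_progress)
		t->sigpending = false;
}
```

In this model the flag stays set across model_recalc_sigpending() until
model_mark_safe() has run, which is roughly why recalc_sigpending() also
needs to know about tasks that are still being patched.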

There are tasks which are not affected by this technique, though. The
fake signal is not sent to kthreads; they should be handled in a
different way. Also, if a task is in the TASK_RUNNING state but not
currently running on any CPU, it does not get the IPI, but it will
eventually handle the signal anyway. Lastly, if the task is running in
the kernel (in the TASK_RUNNING state), it gets the IPI, but the signal
is not handled on return from the interrupt. It will be handled on a
later return to userspace.

If the task was sleeping in a syscall, it is woken by our fake signal,
checks whether TIF_SIGPENDING is set (by calling the signal_pending()
predicate) and returns ERESTART* or EINTR. Syscalls with ERESTART*
return values are restarted in the case of the fake signal (see
do_signal()). EINTR is propagated back to the userspace program. This
could disturb the program, but...
* each process dealing with signals should react appropriately to EINTR
  return values.
* syscalls returning EINTR are quite common in the system even when no
  fake signal is sent.
* freezer sends the fake signal and does not deal with EINTR in any way.
  Thus EINTR values are returned when the system is resumed.
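
For reference, reacting to EINTR in a userspace program usually means
retrying the call. A minimal sketch follows; read_retry_eintr() is a
hypothetical helper name, not something this patch adds or depends on.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: call read() again whenever it fails with EINTR,
 * for example after the sleeping syscall was woken by a signal.
 * Any other error, or a successful read, is returned to the caller.
 */
static ssize_t read_retry_eintr(int fd, void *buf, size_t count)
{
	ssize_t ret;

	do {
		ret = read(fd, buf, count);
	} while (ret < 0 && errno == EINTR);

	return ret;
}
```

A program written this way should not notice whether the interruption
came from a real signal, the freezer or the fake signal described here.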

The actual marking of a task as safe is done in entry_64.S on the
syscall and interrupt/exception exit paths.

Signed-off-by: Miroslav Benes <mbenes@xxxxxxx>
Reviewed-by: Jiri Kosina <jkosina@xxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Oleg Nesterov <oleg@xxxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Signed-off-by: Jiri Slaby <jslaby@xxxxxxx>
---
kernel/livepatch/cmodel-kgraft.c | 23 +++++++++++++++++++++++
kernel/signal.c | 3 ++-
2 files changed, 25 insertions(+), 1 deletion(-)
diff --git a/kernel/livepatch/cmodel-kgraft.c b/kernel/livepatch/cmodel-kgraft.c
index 196b08823f73..fd041ca30161 100644
--- a/kernel/livepatch/cmodel-kgraft.c
+++ b/kernel/livepatch/cmodel-kgraft.c
@@ -107,6 +107,27 @@ static bool klp_kgraft_still_patching(void)
 	return failed;
 }
 
+static void klp_kgraft_send_fake_signal(void)
+{
+	struct task_struct *p;
+	unsigned long flags;
+
+	read_lock(&tasklist_lock);
+	for_each_process(p) {
+		/*
+		 * send fake signal to all non-kthread processes which are still
+		 * not migrated
+		 */
+		if (!(p->flags & PF_KTHREAD) &&
+		    klp_kgraft_task_in_progress(p) &&
+		    lock_task_sighand(p, &flags)) {
+			signal_wake_up(p, 0);
+			unlock_task_sighand(p, &flags);
+		}
+	}
+	read_unlock(&tasklist_lock);
+}
+
 static void klp_kgraft_work_fn(struct work_struct *work)
 {
 	static bool printed = false;
@@ -117,6 +138,8 @@ static void klp_kgraft_work_fn(struct work_struct *work)
 				KGRAFT_TIMEOUT);
 			printed = true;
 		}
+		/* send fake signal */
+		klp_kgraft_send_fake_signal();
 		/* recheck again later */
 		queue_delayed_work(klp_kgraft_wq, &klp_kgraft_work,
 				KGRAFT_TIMEOUT * HZ);
diff --git a/kernel/signal.c b/kernel/signal.c
index d51c5ddd855c..5a3f56a69122 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -157,7 +157,8 @@ void recalc_sigpending_and_wake(struct task_struct *t)
 
 void recalc_sigpending(void)
 {
-	if (!recalc_sigpending_tsk(current) && !freezing(current))
+	if (!recalc_sigpending_tsk(current) && !freezing(current) &&
+	    !klp_kgraft_task_in_progress(current))
 		clear_thread_flag(TIF_SIGPENDING);
 
 }
--
2.3.5