Re: [PATCH] posix-cpu-timers: Cleanup CPU timers before freeing them during exec

From: Eric W. Biederman
Date: Tue Aug 09 2022 - 15:04:09 EST


Thadeu Lima de Souza Cascardo <cascardo@xxxxxxxxxxxxx> writes:

> Commit 55e8c8eb2c7b ("posix-cpu-timers: Store a reference to a pid not a
> task") started looking up tasks by PID when deleting a CPU timer.
>
> When a non-leader thread calls execve, it will switch PIDs with the leader
> process. Then, as it calls exit_itimers, posix_cpu_timer_del cannot find
> the task because the timer still points out to the old PID.


I think this description is missing something.

Looking at how clock_pid_type selects which task to go through
to obtain the sighand lock, and at the fact that the sighand_struct
can change during exec, I don't think this change is necessarily
wrong. I am just trying to understand what is going on that makes
it necessary.

The function cpu_timer_task_rcu should return the one remaining task if
it is a process-wide timer, as all of the other threads have been reaped
after de_thread.

For a per-thread timer on the surviving thread, I can see exchange_tids
causing clock_pid_type to return the thread's old pid, which
exchange_tids has attached to a task that de_thread has freed with
release_task.

If that analysis is correct I think your change is safe only
because posix_cpu_timers_exit does not do anything with the pids.

Perhaps it would be better to do something like the diff below. That
is, always call posix_cpu_timers_exit before exchange_tids can run. That
way there is nothing clever going on for us to stumble over later.

Once long ago I tried to remove the pid swap but unfortunately
glibc's pthread implementation relies on the fact that
getpid() == gettid() for the first thread after exec. Sigh.

diff --git a/fs/exec.c b/fs/exec.c
index 45914e57c0d5..a2a0b3faf603 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1072,6 +1072,10 @@ static int de_thread(struct task_struct *tsk)
 	if (!thread_group_leader(tsk))
 		sig->notify_count--;

+#ifdef CONFIG_POSIX_TIMERS
+	/* Cleanup the per thread timers before the pid changes */
+	posix_cpu_timers_exit(tsk);
+#endif
 	while (sig->notify_count) {
 		__set_current_state(TASK_KILLABLE);
 		spin_unlock_irq(lock);
diff --git a/kernel/exit.c b/kernel/exit.c
index 4f7424523bac..f7e19b73cf6c 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -104,7 +104,6 @@ static void __exit_signal(struct task_struct *tsk)
 	spin_lock(&sighand->siglock);

 #ifdef CONFIG_POSIX_TIMERS
-	posix_cpu_timers_exit(tsk);
 	if (group_dead)
 		posix_cpu_timers_exit_group(tsk);
 #endif
@@ -772,6 +771,12 @@ void __noreturn do_exit(long code)
 	if (tsk->mm)
 		sync_mm_rss(tsk->mm);
 	acct_update_integrals(tsk);
+#ifdef CONFIG_POSIX_TIMERS
+	/* Cleanup the per thread timers before de_thread can change the pid */
+	spin_lock_irq(&tsk->sighand->siglock);
+	posix_cpu_timers_exit(tsk);
+	spin_unlock_irq(&tsk->sighand->siglock);
+#endif
 	group_dead = atomic_dec_and_test(&tsk->signal->live);
 	if (group_dead) {
 		/*

Eric