Re: [PATCH] coredump: allow PTRACE_ATTACH to coredump user mode helper

From: Oleg Nesterov
Date: Thu Jul 08 2021 - 08:02:27 EST


On 07/05, Vladimir Divjak wrote:
>
> * Problem description / Rationale:
> In automotive and/or embedded environments,
> the storage capacity to store, and/or
> network capabilities to upload
> a complete core file can easily be a limiting factor,
> making offline issue analysis difficult.

To be honest, I don't like the idea... plus the implementation looks
horrible to me, sorry.

Can't the coredump helper process simply do
ptrace(PTRACE_SEIZE, PTRACE_O_TRACEEXIT), close the pipe, and wait
for PTRACE_EVENT_EXIT ? Then it can use ptrace() as usual.
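
Roughly, the helper side of that could look like the sketch below. This is only an illustration, not tested against a dumping task: the function names are made up, and it assumes the helper learns the pid of the dumping process some other way (for example via %p in core_pattern) and that core_pipe_fd is the read end of the core pipe (normally stdin).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* True if a waitpid() status reports a PTRACE_EVENT_EXIT stop. */
int is_exit_event(int status)
{
	return WIFSTOPPED(status) &&
	       (status >> 8) == (SIGTRAP | (PTRACE_EVENT_EXIT << 8));
}

/*
 * Hypothetical helper-side sequence: seize the dumping task, close the
 * core pipe, then wait for the exit event.
 * Returns 0 once the tracee is stopped at PTRACE_EVENT_EXIT, -1 otherwise.
 */
int seize_and_wait_for_exit(pid_t pid, int core_pipe_fd)
{
	int status;

	assert(pid > 0);

	if (ptrace(PTRACE_SEIZE, pid, NULL,
		   (void *)(long)PTRACE_O_TRACEEXIT) == -1)
		return -1;

	/* With no reader left, further core writes to the pipe should fail. */
	close(core_pipe_fd);

	for (;;) {
		if (waitpid(pid, &status, __WALL) == -1)
			return -1;
		if (is_exit_event(status))
			return 0;	/* stopped at exit; ptrace() requests work now */
		if (WIFEXITED(status) || WIFSIGNALED(status))
			return -1;	/* tracee is gone without the expected stop */
		/* Some other stop: resume without injecting a signal. */
		if (ptrace(PTRACE_CONT, pid, NULL, NULL) == -1)
			return -1;
	}
}
```

Once seize_and_wait_for_exit() returns 0 the tracee is in a ptrace stop, so the helper can issue the usual requests (PTRACE_GETREGSET, PTRACE_PEEKDATA, ...) and finish with PTRACE_DETACH.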

> +void cdh_unlink_current(void)
> +{
> +	struct cdh_entry *entry, *next;
> +
> +	mutex_lock(&cdh_mutex);
> +	list_for_each_entry_safe(entry, next, &cdh_list, cdh_list_link) {

Why _safe ?
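
For context: the _safe variant only matters when the loop body removes the current entry and then keeps walking. A lookup, or a loop that stops right after removing one entry, does not need it. The userspace sketch below shows the difference on a plain singly linked list. It is a stand-in with made-up names, not the kernel list API and not the patch's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for a list entry. */
struct node {
	int key;
	struct node *next;
};

/* Prepend a new node and return the new head. */
struct node *push(struct node *head, int key)
{
	struct node *n = malloc(sizeof(*n));

	assert(n != NULL);	/* keep the sketch short: no error path */
	n->key = key;
	n->next = head;
	return n;
}

/* Read-only lookup: nothing is unlinked, so plain iteration is enough. */
struct node *find_key(struct node *head, int key)
{
	for (struct node *n = head; n; n = n->next)
		if (n->key == key)
			return n;
	return NULL;
}

/* Removal during the walk: the successor must be saved before free(). */
struct node *remove_key(struct node *head, int key)
{
	struct node **link = &head;
	struct node *n, *next;

	for (n = head; n; n = next) {
		next = n->next;		/* saved first, as a _safe iterator does */
		if (n->key == key) {
			*link = next;	/* unlink n from its predecessor */
			free(n);
		} else {
			link = &n->next;
		}
	}
	return head;
}
```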

> +bool cdh_ptrace_allowed(struct task_struct *task)
> +{
> +	struct cdh_entry *entry;
> +
> +	mutex_lock(&cdh_mutex);
> +	list_for_each_entry(entry, &cdh_list, cdh_list_link) {
> +		if (task_tgid_nr(entry->task_being_dumped) == task_tgid_nr(task)
> +		    && entry->helper_pid == task_tgid_nr(current)) {
> +			reinit_completion(&(entry->ptrace_done));
> +			wait_task_inactive(entry->task_being_dumped, 0);

So. IIUC, this assumes that when cdh_ptrace_allowed() returns, the dumping
process must be blocked in dump_emit()->wait_for_completion(ptrace_done).
And thus ptrace_attach() can safely do task->state = TASK_TRACED.

But it is possible that __dump_emit() has already failed and task_being_dumped
sleeps in cdh_unlink_current() waiting for cdh_mutex. So it will be running
again right after cdh_ptrace_allowed() drops cdh_mutex, and the assumption
above does not hold.

> +struct cdh_entry *cdh_get_entry_for_current(void)
> +{
> +	struct cdh_entry *entry;
> +
> +	list_for_each_entry(entry, &cdh_list, cdh_list_link) {
> +		if (entry->task_being_dumped == current)
> +			return entry;

Why is it safe without cdh_mutex ?

> @@ -361,6 +362,8 @@ static int ptrace_attach(struct task_struct *task, long request,
>  {
>  	bool seize = (request == PTRACE_SEIZE);
>  	int retval;
> +	bool core_state = false;
> +	bool core_trace_allowed = false;
> 
>  	retval = -EIO;
>  	if (seize) {
> @@ -392,10 +395,17 @@ static int ptrace_attach(struct task_struct *task, long request,
> 
>  	task_lock(task);
>  	retval = __ptrace_may_access(task, PTRACE_MODE_ATTACH_REALCREDS);
> +	if (unlikely(task->mm->core_state))
> +		core_state = true;

task->mm can be NULL here, e.g. if the task has already exited and dropped
its mm, so this dereference can oops.

> +	if (!seize && unlikely(core_state)) {
> +		if (cdh_ptrace_allowed(task))
> +			core_trace_allowed = true;
> +	}

Why !seize ???

What if ptrace_attach() fails after that? Who will wake this task up ?

> +	/*
> +	 * Core state process does not process signals normally.
> +	 * set directly to TASK_TRACED if allowed by cdh_ptrace_allowed.
> +	 */
> +	if (core_trace_allowed)
> +		task->state = TASK_TRACED;

See above.

But even if I missed something, this is wrong no matter what: you should
never change another task's state.

> @@ -821,6 +838,8 @@ static int ptrace_resume(struct task_struct *child, long request,
>  {
>  	bool need_siglock;
> 
> +	cdh_signal_continue(child);

This takes cdh_mutex on every ptrace_resume() call :/

Oleg.