Re: ptrace() hangs on attempt to seize/attach stopped & frozen task

From: Pedro Alves
Date: Thu Nov 19 2015 - 11:50:01 EST

On 11/17/2015 07:34 PM, Oleg Nesterov wrote:
> On 11/16, Tejun Heo wrote:

>>> And perhaps we can simply remove this logic? I forgot why do we hide this
>>> STOPPED -> RUNNING -> TRACED transition from the attaching thread. But the
>>> vague feeling tells me that we discussed this before and perhaps it was me
>>> who suggested to avoid the user-visible change when you introduced this
>>> transition...
>> Heh, it was too long ago for me to remember much. :)
> Same here...
>>> Anyway, now I do not understand why do we want to hide it. Lets consider
>>> the following "test-case",
>>> void test(int pid)
>>> {
>>> kill(pid, SIGSTOP);
>>> waitpid(pid, NULL, WSTOPPED);
>>> ptrace(PTRACE_ATTACH-OR-PTRACE_SEIZE, pid, 0,0);
>>> assert(ptrace(PTRACE_DETACH, pid, 0,0) == 0);
>>> }
>>> Yes, it will fail if we remove JOBCTL_TRAPPING. But it can equally fail
>>> if SIGCONT comes before ATTACH, so perhaps we do not really care?
>>> Jan, Pedro, do you think the patch below can break gdb somehow? With this
>>> patch you can never assume that waitpid(WNOHANG) or ptrace(WHATEVER) will
>>> succeed right after PTRACE_ATTACH/PTRACE_SEIZE, even if you know that the
>>> tracee was TASK_STOPPED before attach.

Not sure, because I don't think I fully understand that proposed change.

Both GDB and gdbserver have special processing for attaching to already-stopped
processes. (and neither use PTRACE_SEIZE yet.)

Here's the gdbserver version:;a=blob;f=gdb/gdbserver/linux-low.c;h=41ab510fa4ac5654f101f08efb68e26b5bc5dbd7;hb=HEAD#l903

Copied here for convenience:

907 linux_attach_lwp (ptid_t ptid)
908 {
909 struct lwp_info *new_lwp;
910 int lwpid = ptid_get_lwp (ptid);
912 if (ptrace (PTRACE_ATTACH, lwpid, (PTRACE_TYPE_ARG3) 0, (PTRACE_TYPE_ARG4) 0)
913 != 0)
914 return errno;
916 new_lwp = add_lwp (ptid);
918 /* We need to wait for SIGSTOP before being able to make the next
919 ptrace call on this LWP. */
920 new_lwp->must_set_ptrace_flags = 1;
922 if (linux_proc_pid_is_stopped (lwpid))
923 {
924 if (debug_threads)
925 debug_printf ("Attached to a stopped process\n");
927 /* The process is definitely stopped. It is in a job control
928 stop, unless the kernel predates the TASK_STOPPED /
929 TASK_TRACED distinction, in which case it might be in a
930 ptrace stop. Make sure it is in a ptrace stop; from there we
931 can kill it, signal it, et cetera.
933 First make sure there is a pending SIGSTOP. Since we are
934 already attached, the process can not transition from stopped
935 to running without a PTRACE_CONT; so we know this signal will
936 go into the queue. The SIGSTOP generated by PTRACE_ATTACH is
937 probably already in the queue (unless this kernel is old
938 enough to use TASK_STOPPED for ptrace stops); but since
939 SIGSTOP is not an RT signal, it can only be queued once. */
940 kill_lwp (lwpid, SIGSTOP);
942 /* Finally, resume the stopped process. This will deliver the
943 SIGSTOP (or a higher priority signal, just like normal
944 PTRACE_ATTACH), which we'll catch later on. */
945 ptrace (PTRACE_CONT, lwpid, (PTRACE_TYPE_ARG3) 0, (PTRACE_TYPE_ARG4) 0);
946 }
948 /* The next time we wait for this LWP we'll see a SIGSTOP as PTRACE_ATTACH
949 brings it to a halt.

linux_proc_pid_is_stopped checks whether the state in /proc/pid/status is "T (stopped)".

Here's the equivalent in gdb:;a=blob;f=gdb/linux-nat.c;h=841ec3949c37438dfba924d8db6b37ffc416dd29;hb=HEAD#l974

This queuing of a SIGSTOP + PTRACE_CONT was necessary because
otherwise when gdb attaches to a job stopped process, gdb would hang in the waitpid
after PTRACE_ATTACH, waiting for the initial SIGSTOP which would never arrive.

If the proposed change makes it so that a new intermediate state can be observed
right after PTRACE_ATTACH, and so linux_proc_pid_is_stopped can return false,
then there's potential for breakage. But maybe not, if we're sure that
that when that happens, waitpid returns for the initial

Pedro Alves

To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at
Please read the FAQ at