Re: Ptrace documentation, draft #1

From: Denys Vlasenko
Date: Fri May 20 2011 - 14:02:37 EST


On Thu, May 19, 2011 at 9:49 PM, Oleg Nesterov <oleg@xxxxxxxxxx> wrote:
> On 05/18, Denys Vlasenko wrote:
>>
>> On Mon, May 16, 2011 at 5:31 PM, Oleg Nesterov <oleg@xxxxxxxxxx> wrote:
>> >
>> > Note: currently a killed PT_TRACE_EXIT tracee can stop and report
>> > PTRACE_EVENT_EXIT before it actually exits. I'd say this is wrong and
>> > should be fixed.
>>
>> Yes, I assumed this is normal.
>> Or do you mean that *killed* tracee (that is, by signal) also stops there?
>
> Yes.

Thanks, noted.

>> >> Tracer can kill a tracee with ptrace(PTRACE_KILL, pid, 0, 0).
>> >
>> > Oh, no. This is more or less equivalent to PTRACE_CONT(SIGKILL) except
>> > PTRACE_KILL doesn't return the error if the tracee is not stopped.
>> >
>> > I'd say: do not use PTRACE_KILL, never. If the tracer wants to kill
>> > the tracee - kill or tkill should be used.
>>
>> Regardless. We need to tell users what to expect after they do PTRACE_KILL.
>
> Once again, PTRACE_KILL == ptrace(PTRACE_CONT, SIGKILL), except it
> doesn't return the error if the tracee is not stopped.

Oleg, this doesn't explain the resulting behavior in terms understandable
to mere mortals. *What will happen* when a user does ptrace(PTRACE_KILL)?

Yes, it's obvious that the tracee gets SIGKILLed, but will it report WIFSIGNALED
or not? Userspace folks won't be 100.00% sure unless we are exact about it.

They may think "hmm... maybe this PTRACE_KILL thing is so powerful it makes
it unnecessary to waitpid for the nuked process?", which actually isn't such
a stupid hypothesis - if the tracer itself PTRACE_KILLs the tracee, it doesn't
want to know about it anymore, so why should it waitpid for it?
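
For comparison, here is a minimal sketch of the alternative Oleg suggests:
kill the tracee with kill(SIGKILL) instead of PTRACE_KILL, then waitpid for
it. The helper name kill_and_reap_tracee is made up for illustration. In this
variant the tracer still has to reap the tracee, and the final wait status
is expected to be WIFSIGNALED with WTERMSIG == SIGKILL. It says nothing about
what PTRACE_KILL itself guarantees.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not from the draft): fork a tracee that stops
 * itself, kill it with kill(SIGKILL), then reap it with waitpid.
 * Returns the terminating signal, or -1 on any unexpected result. */
int kill_and_reap_tracee(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Tracee: ask to be traced, then stop so the tracer can attach. */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(1);
        raise(SIGSTOP);
        _exit(0);
    }

    /* Tracer: wait for the SIGSTOP signal-delivery-stop. */
    if (waitpid(pid, &status, 0) != pid || !WIFSTOPPED(status)) {
        kill(pid, SIGKILL);
        waitpid(pid, &status, 0);
        return -1;
    }

    /* Kill with kill(), not PTRACE_KILL. */
    if (kill(pid, SIGKILL) == -1)
        return -1;

    /* The tracee is not gone until it has been waited for. */
    if (waitpid(pid, &status, 0) != pid || !WIFSIGNALED(status))
        return -1;
    return WTERMSIG(status);
}
```

A tracer would call kill_and_reap_tracee() and compare the result against
SIGKILL; without the second waitpid the dead tracee stays around as a zombie.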


>> >> When any thread executes exit_group syscall, every tracee reports its
>> >> death to its tracer.
>> >>
>> >> ??? Is it true that *every* thread reports death?
>> >
>> > Yes, if you mean do_wait() as above.
>>
>> And will PTRACE_EVENT_EXIT happen for *every* tracee (which has it configured)?
>
> Oh. This depends on /dev/random. Most probably the exiting tracee
> dequeues the (implicit) SIGKILL and reports PTRACE_EVENT_EXIT. Oh,
> unless arch_ptrace_stop_needed() is true. But it can exit on its own
> or dequeue another fatal signal, then it won't stop because of
> fatal_signal_pending().
>
> In short: this should be fixed. We already discussed this a bit (many
> times ;), first of all we should define the correct behaviour. If you
> ask me, personally I think PTRACE_EVENT_EXIT should be always reported
> unless the task was explicitly killed by SIGKILL. But this is not clear.

Documented with "KNOWN BUG:" tag.
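
For reference, a small sketch of how a tracer can recognize this stop from
the waitpid status, assuming it has enabled PTRACE_O_TRACEEXIT via
PTRACE_SETOPTIONS. The helper names ptrace_event_of and is_exit_event are
made up here. Given the bug above, a tracer should treat the event as
optional and still handle a plain exit or death status.

```c
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/wait.h>

/* Hypothetical helper: return the PTRACE_EVENT_* code carried by a
 * waitpid status, or 0 if the status is not a ptrace event stop.
 * Event stops are reported as a SIGTRAP stop with the event code
 * stored in bits 16 and up of the status word. */
int ptrace_event_of(int status)
{
    if (!WIFSTOPPED(status) || WSTOPSIG(status) != SIGTRAP)
        return 0;
    return (status >> 16) & 0xffff;
}

/* Hypothetical helper: nonzero if the status is a PTRACE_EVENT_EXIT stop. */
int is_exit_event(int status)
{
    return ptrace_event_of(status) == PTRACE_EVENT_EXIT;
}
```

In a wait loop this would be used as
"if (is_exit_event(status)) { ... PTRACE_CONT to let the tracee finish ... }".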


>> >> Kernel delivers an extra SIGTRAP to tracee after execve syscall
>> >> returns. This is an ordinary signal (similar to one generated by kill
>> >> -TRAP), not a special kind of ptrace-stop. If PTRACE_O_TRACEEXEC option
>> >> is in effect, a PTRACE_EVENT_EXEC-stop is generated instead.
>> >>
>> >> ??? can this SIGTRAP be distinguished from "real" user-generated SIGTRAP
>> >>     by looking at its siginfo?
>> >
>> > Afaics no. Well, except .si_pid shows that the signal was sent by the
>> > tracing process to itself.
>>
>> What about si_code? Is it set to SI_KERNEL for this signal?
>
> No, SI_USER.

This is stupid. This signal is sent by the kernel. Why is it flagged as "from user"?
Maybe we should change it?

(BTW, where is it generated in the kernel source? I found
PTRACE_EVENT_EXEC generation, but failed to find
"old-school SIGTRAP" generation code...)


>> >> ??? Are syscalls interrupted by signals which are suppressed by tracer?
>> >>     If yes, document it here
>> >
>> > Please reiterate, can't understand.
>>
>> Let's say tracee is in nanosleep. Then some signal arrives,
>
> note that the tracee is already interrupted here, sys_nanosleep()
> returns ERESTART_RESTARTBLOCK.
>
>> but tracer decides to ignore it. In tracer:
>>
>> waitpid: WIFSTOPPED, WSTOPSIG = some_sig  <===
>> ptrace(PTRACE_CONT, pid, 0, 0)  ===>
>>
>> will this interrupt nanosleep in tracee?
>
> Yes and no. Once again, the tracee already returned from sys_nanosleep,
> but it will restart this syscall (actually, it will do sys_restart_syscall)
> and continue to sleep.

Documented as such.


>> >>       ptrace(PTRACE_cmd, pid, 0, sig);
>> >> where cmd is CONT, DETACH, SYSCALL, SINGLESTEP, SYSEMU,
>> >> SYSEMU_SINGLESTEP. If tracee is in signal-delivery-stop, sig is the
>> >> signal to be injected. Otherwise, sig is ignored.
>> >
>> > There is another special case. If the tracee single-steps into the
>> > signal handler, it reports SIGTRAP as if it received this signal.
>> > But ptrace(PTRACE, ..., sig) doesn't inject after that.
>>
>> This is part of missing doc about PTRACE_SINGLESTEP.
>> From what you are saying it looks like PTRACE_SINGLESTEP
>> implies PTRACE_SYSCALL behavior: "report syscall-stops".
>
> Hmm. Why do you think so?

I am totally unfamiliar with PTRACE_SINGLESTEP.


Thanks! Expect draft #3 soon.
--
vda