Re: PTRACE_POKEDATA on PROT_NONE hangs kernel

Michael Elizabeth Chastain (mec@shout.net)
Mon, 21 Sep 1998 21:02:22 -0500


Hi Linus and linux-kernel,

Linus wrote:
> Fix for the above, take two:
>
> -       if (!pte_write(*pgtable)) {
> +       if (!pte_dirty(*pgtable)) {
>                 handle_mm_fault(tsk, vma, addr, 1);
>                 goto repeat;
>         }

I tried this patch and it behaves the same as stock 2.1.122. I'm still
getting an infinite loop in put_long. Specifically, I have a pte of
0x00d48062, which is !_PAGE_PRESENT and _PAGE_PROTNONE. With this test
in place, that piece of code does call handle_mm_fault. The pte value
is 0x00d48062 before handle_mm_fault, and it is still 0x00d48062 after
handle_mm_fault. Infinite loop.

I think the problem is deeper than changing a one-line test. Here is
tonight's analysis. :)

---

I read some more code, reread my 486 and Pentium Pro databooks, and thought some more, and I now have a more coherent explanation. Check me on this, please.

(1) On an x86 CPU, "there is no way to say 'this page is present, but cannot be accessed from user or kernel mode'". Therefore, a page that is mmap'ed with PROT_NONE gets a pte with _PAGE_PRESENT clear and _PAGE_PROTNONE set. This pte is not mapped to a physical page.

(2) The i386 implementation of the pte_present macro returns true if either _PAGE_PRESENT is set or _PAGE_PROTNONE is set. Thus these pte's are considered "present". This assumption is built into mm/memory.c in functions such as handle_mm_fault, handle_pte_fault, do_no_page, and do_wp_page.
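To make point (2) concrete, here is a small user-space sketch of the two notions of "present". The bit values are assumptions modelled on the 2.1/2.2-era i386 headers, and hw_present and check_reported_pte are hypothetical helper names used only for illustration; this is not the kernel's actual macro text.

```c
#include <assert.h>

/* Assumed bit values; check include/asm-i386/pgtable.h in your own tree. */
#define _PAGE_PRESENT  0x001UL
#define _PAGE_PROTNONE 0x002UL  /* assumed meaningful only when _PAGE_PRESENT is clear */

/* What the MMU looks at: only the hardware present bit. */
static int hw_present(unsigned long pte)
{
    return (pte & _PAGE_PRESENT) != 0;
}

/* What mm/memory.c sees, per point (2): present OR protnone. */
static int pte_present(unsigned long pte)
{
    return (pte & (_PAGE_PRESENT | _PAGE_PROTNONE)) != 0;
}

/* Sanity check against the pte value reported earlier in this message. */
static void check_reported_pte(void)
{
    const unsigned long pte = 0x00d48062UL;

    assert(!hw_present(pte));   /* the hardware would fault on it */
    assert(pte_present(pte));   /* but the mm layer calls it "present" */
}
```

Under these assumed values, the pte 0x00d48062 is "present" to the mm code and not present to the hardware, which is exactly the disagreement the rest of this message is about.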

(3) When a process generates a page fault, do_page_fault in arch/i386/mm/fault.c checks vma access rights before calling handle_mm_fault. For a _PAGE_PROTNONE page, the access right check always fails, and do_page_fault does not call handle_mm_fault. do_page_fault generates a SIGSEGV for a user-space access or a -EFAULT for a system-call access.
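The access check in point (3) can be sketched roughly as follows. This is a simplified user-space model, not the code in arch/i386/mm/fault.c: check_access and enum fault_result are hypothetical names, and the vm_flags bit values are assumed to match include/linux/mm.h.

```c
#include <assert.h>

/* Assumed vm_flags permission bits (see include/linux/mm.h). */
#define VM_READ  0x1UL
#define VM_WRITE 0x2UL
#define VM_EXEC  0x4UL

enum fault_result { FAULT_HANDLE, FAULT_BAD_AREA };

/* Hypothetical helper: would this fault ever reach handle_mm_fault?
 * FAULT_BAD_AREA stands for the SIGSEGV / -EFAULT path. */
static enum fault_result check_access(unsigned long vm_flags, int write)
{
    assert(write == 0 || write == 1);

    if (write)
        return (vm_flags & VM_WRITE) ? FAULT_HANDLE : FAULT_BAD_AREA;

    /* A read is allowed if the vma is readable or executable. */
    return (vm_flags & (VM_READ | VM_EXEC)) ? FAULT_HANDLE : FAULT_BAD_AREA;
}
```

A PROT_NONE vma has none of the three bits set, so in this model both check_access(0, 0) and check_access(0, 1) give FAULT_BAD_AREA, and handle_mm_fault is never reached from the page-fault path.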

Ok, have I got that right?

All that machinery looks well-designed and correct to me.

(4) ptrace allows a parent process to read and write the address space of its child. ptrace ignores the vma protection checks. This is essential for writing breakpoint instructions into read-only code segments, and, well, it's just the way ptrace has always worked.

(5) The ptrace functions that actually get and put data are named get_long and put_long. get_long and put_long make sure that the pgd, pmd, and pte exist and that the pte has a physical page. get_long and put_long do this by calling handle_mm_fault. This also takes care of copy-on-write copying, so that breakpoints written into read-only code segments go into an unshared copy of the page rather than the original shared copy.

Have I got that right, too?

---

So here is the core problem: the mm machinery in (1), (2), and (3) above wants to handle _PAGE_PROTNONE pages without ever physically mapping them. It assumes that a user process can't ever access that page. A user process can't access that page, but its parent process can, with ptrace!!

ptrace's idea about forcing a page to be present collides with mm/memory.c's idea that a _PAGE_PROTNONE pte is already present, even when it is not mapped.

I think that is why your one-line patches keep failing in different obscure ways:

Stock 2.1.122:
    put_long says "this pte is not present"
    put_long calls handle_mm_fault
    handle_mm_fault says "hey this pte is present, it's _PAGE_PROTNONE"
    handle_mm_fault returns
    put_long says "this pte is not present", loops and tries again

Linus Patch #1:
    put_long says "ok this pte is mapped, good enough"
    put_long writes on the page
    fortunately there is an underlying page (although it's not present)
    unfortunately this page has not been properly COW'ed, so everybody
      who shares that page sees what has been written
    in fact the page is the anonymous read-only zero page, so
      everybody's bss is messed up as they get demand loaded!

Linus Patch #2:
    produces another variant of the misunderstanding in Stock 2.1.122
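The stock 2.1.122 sequence can be modelled in user space like this. It is only a toy: toy_handle_mm_fault and toy_put_long are hypothetical stand-ins, the bit values are assumed, and the retry bound exists solely so the model terminates (the real loop has no such bound).

```c
#include <assert.h>

#define _PAGE_PRESENT  0x001UL  /* assumed value */
#define _PAGE_PROTNONE 0x002UL  /* assumed value; valid only when not present */

/* Toy handle_mm_fault: treats a PROTNONE pte as already present, so it
 * returns without changing it. */
static void toy_handle_mm_fault(unsigned long *pte)
{
    if (*pte & (_PAGE_PRESENT | _PAGE_PROTNONE))
        return;                    /* "hey this pte is present" */
    *pte |= _PAGE_PRESENT;         /* otherwise pretend to map a page */
}

/* Toy put_long retry loop. Returns the number of retries needed, or -1
 * if the pte never became hardware-present within max_tries. */
static int toy_put_long(unsigned long *pte, int max_tries)
{
    int tries;

    assert(max_tries > 0);
    for (tries = 0; tries < max_tries; tries++) {
        if (*pte & _PAGE_PRESENT)
            return tries;          /* page is mapped; the write can proceed */
        toy_handle_mm_fault(pte);  /* ask mm to fault the page in */
    }
    return -1;                     /* gave up: the real code would spin forever */
}
```

Starting from the pte value 0x00d48062, every pass through the loop leaves the pte unchanged, so the model gives up with -1. A completely empty pte (0) gets mapped on the first pass and succeeds on the second.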

---

I have two ideas for a solution:

#1 change the semantics of ptrace put_long. ptrace get_long has no problem: check for _PAGE_PROTNONE and return 0 specifically for that case (same semantics). ptrace put_long would check for _PAGE_PROTNONE and return -EFAULT or a similar error. I don't like this very much but it does make ptrace conform to the "PROTNONE is never mapped" constraint.

#2 change the constraint so that PROTNONE pages can have physical pages mapped to them. When ptrace is operating on a PROTNONE page, it calls handle_mm_fault, and when it is done, it turns off the present bit so that the page retains its semantics. This is more complicated but it is compatible with 2.0.XX and allows applications to keep playing ptrace games writing onto PROTNONE segments.
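To compare the two ideas side by side, here is a deliberately abstract sketch. struct toy_pte, put_long_refuse, and put_long_force are all hypothetical, and the fields are plain flags rather than the i386 bit layout; it illustrates the intended semantics only, not a proposed patch.

```c
#include <assert.h>
#include <errno.h>

/* Toy pte with abstract flags (not the real i386 bit layout). */
struct toy_pte {
    int protnone;    /* page was mapped with PROT_NONE */
    int mapped;      /* a private physical page is attached */
    int hw_present;  /* hardware present bit */
    long data;       /* stand-in for the word being poked */
};

/* Idea #1: refuse to write through a PROTNONE pte. */
static int put_long_refuse(struct toy_pte *pte, long value)
{
    assert(pte != 0);
    if (pte->protnone)
        return -EFAULT;
    pte->mapped = 1;
    pte->hw_present = 1;
    pte->data = value;
    return 0;
}

/* Idea #2: map a page for the write, then turn the present bit back
 * off so the tracee still faults on any access of its own. */
static int put_long_force(struct toy_pte *pte, long value)
{
    assert(pte != 0);
    pte->mapped = 1;          /* stand-in for handle_mm_fault mapping a page */
    pte->hw_present = 1;
    pte->data = value;
    if (pte->protnone)
        pte->hw_present = 0;  /* restore PROT_NONE semantics */
    return 0;
}
```

In this model, idea #1 leaves the PROTNONE page untouched and reports -EFAULT, while idea #2 ends with a mapped page holding the new data but with the present bit clear.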

Or maybe there is a better solution.

Also, whatever the solution is, I can change my application (it's GPL and it's not popular yet -- it's a trace-and-replay debugger). I don't know about the "gdb efence" case.

What do you think?

Michael Elizabeth Chastain
<mailto:mec@shout.net>
"love without fear"
