On Fri, 2011-07-15 at 16:38 +0800, MailingLists wrote:
> On 07/15/2011 04:20 PM, Peter Zijlstra wrote:
>> On Fri, 2011-07-15 at 16:07 +0800, Shan Hai wrote:
>>> The following test case could reveal a bug in the futex_lock_pi()
>>>
>>> BUG: On FUTEX_LOCK_PI, there is an infinite loop in the futex_lock_pi()
>>> on Powerpc e500 core.
>>> Cause: The linux kernel on the e500 core has no write permission on
>>> the COW page, refer to the head comment of the following test code.
>>>
>>> ftrace on test case:
>>> [000] 353.990181: futex_lock_pi_atomic<-futex_lock_pi
>>> [000] 353.990185: cmpxchg_futex_value_locked<-futex_lock_pi_atomic
>>> [snip]
>>> [000] 353.990191: do_page_fault<-handle_page_fault
>>> [000] 353.990192: bad_page_fault<-handle_page_fault
>>> [000] 353.990193: search_exception_tables<-bad_page_fault
>>> [snip]
>>> [000] 353.990199: get_user_pages<-fault_in_user_writeable
>>> [snip]
>>> [000] 353.990208: mark_page_accessed<-follow_page
>>> [000] 353.990222: futex_lock_pi_atomic<-futex_lock_pi
>>> [snip]
>>> [000] 353.990230: cmpxchg_futex_value_locked<-futex_lock_pi_atomic
>>> [ a loop occurs here ]
>>
>> But but but but, that get_user_pages(.write=1, .force=0) should result
>> in a COW break, getting our own writable page.
>>
>> What is this e500 thing smoking that this doesn't work?
>
> A page could be set to read only by the kernel (supervisor in the powerpc
> literature) on the e500, and that's what the kernel does. It sets the SW
> (supervisor write) bit in the TLB entry to grant write permission to the
> kernel on a page.
>
> And further the SW bit is set according to the DIRTY flag of the PTE.
> PTE.DIRTY is set in do_page_fault(), but futex_lock_pi() disables
> page faults, so PTE.DIRTY can never be set and neither can the SW bit.
> The COW cannot be broken and an infinite loop follows.

I'm fairly sure fault_in_user_writeable() has PF enabled as it takes
mmap_sem, and pagefault_disable() is akin to preempt_disable() on mainline.

Also get_user_pages() fully expects to be able to schedule, and in fact
can call the full pf handler path all by its lonesome self.
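
For readers following the thread, the retry loop under discussion can be modelled in user space. The sketch below is a toy, not kernel code: every name in it (toy_pte, toy_sw, toy_cmpxchg, toy_fault_in, toy_lock_pi) is hypothetical, and the rule that kernel write access mirrors the dirty bit is an assumption taken from the quoted report rather than from the e500 manual. The sets_dirty flag selects between the two positions in the thread: the fault-in path leaving the PTE clean, or the fault-in path running a full fault handler that dirties it.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy page table entry. Assumption from the quoted report: the kernel
 * can write the page only when it is both writable and dirty. */
struct toy_pte {
	bool user_write;
	bool dirty;
};

static bool toy_sw(const struct toy_pte *pte)
{
	return pte->user_write && pte->dirty;
}

/* Stands in for cmpxchg_futex_value_locked() with page faults disabled:
 * a missing write permission is reported as -EFAULT, not fixed up. */
static int toy_cmpxchg(const struct toy_pte *pte, int *uval, int old, int new)
{
	if (!toy_sw(pte))
		return -EFAULT;
	if (*uval == old)
		*uval = new;
	return 0;
}

/* Stands in for fault_in_user_writeable(): breaks COW, and marks the
 * PTE dirty only if the caller says the fault path does so. */
static int toy_fault_in(struct toy_pte *pte, bool sets_dirty)
{
	pte->user_write = true;
	if (sets_dirty)
		pte->dirty = true;
	return 0; /* always succeeds in this model */
}

/* Stands in for the retry loop in futex_lock_pi(). Returns the number
 * of attempts needed, or -1 once max_retries is exhausted (the point at
 * which an unbounded loop would keep spinning). */
int toy_lock_pi(struct toy_pte *pte, int *uval, bool sets_dirty,
		int max_retries)
{
	assert(max_retries > 0);

	for (int tries = 1; tries <= max_retries; tries++) {
		if (toy_cmpxchg(pte, uval, 0, 1) == 0)
			return tries;
		if (toy_fault_in(pte, sets_dirty) != 0)
			return -EFAULT;
	}
	return -1;
}
```

Calling toy_lock_pi() on a clean, read-only toy_pte with sets_dirty false exhausts any retry budget, which matches the reported loop. With sets_dirty true it succeeds on the second attempt, which is what the reply expects from get_user_pages().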