On Fri, 2011-07-15 at 17:08 +0800, Shan Hai wrote:
> The whole scenario should be,
> - the child process triggers a page fault at the first time access to
>   the lock, and it got its own writable page, but it's *clean* for
>   the reason just for checking the status of the lock.
>   I am sorry for above "unbreakable COW".
> - the futex_lock_pi() is invoked because of the lock contention,
>   and the futex_atomic_cmpxchg_inatomic() tries to get the lock,
>   it found out the lock is free so tries to write to the lock for
>   reservation, a page fault occurs, because the page is read only
>   for kernel (e500 specific), and returns -EFAULT to the caller
> - the fault_in_user_writeable() tries to fix the fault,
>   but from the get_user_pages() view everything is ok, because
>   the COW was already broken, retry futex_lock_pi_atomic()

but that's a bug right there, gup(.write=1) _should_ be a complete write
fault, and as such toggle your sw dirty/young tracking.

> - futex_lock_pi_atomic() --> futex_atomic_cmpxchg_inatomic(),
>   another write protection page fault
> - infinite loop
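
For illustration only, the loop described above can be modelled in a small user-space C sketch. This is not kernel code: struct model_pte and the model_*() functions are hypothetical stand-ins for the PTE state, futex_atomic_cmpxchg_inatomic(), fault_in_user_writeable() and futex_lock_pi(), and the retry limit exists only so the model terminates. It assumes, as in the scenario, that a kernel write to the page faults for as long as the software dirty bit is clear.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of one user PTE with software-tracked dirty state. */
struct model_pte {
    bool cow_broken; /* the process already owns a private copy */
    bool sw_dirty;   /* software dirty bit; kernel writes fault while clear */
};

/* Stand-in for futex_atomic_cmpxchg_inatomic(): the store can only
 * succeed once the page has been marked dirty. */
static int model_cmpxchg(const struct model_pte *pte, int *lock, int newval)
{
    if (!pte->sw_dirty)
        return -EFAULT;
    if (*lock == 0)
        *lock = newval;
    return 0;
}

/* Stand-in for fault_in_user_writeable() -> get_user_pages(write=1).
 * When gup_is_full_write_fault is false, it only makes sure the COW is
 * broken, which is already the case, so nothing changes. */
static int model_fault_in_writeable(struct model_pte *pte,
                                    bool gup_is_full_write_fault)
{
    pte->cow_broken = true;
    if (gup_is_full_write_fault)
        pte->sw_dirty = true; /* behave like a complete write fault */
    return 0;
}

/* Returns the number of retries needed to take the lock, or -1 if
 * max_retries is exhausted (the "infinite loop" case). */
int model_futex_lock_pi(bool gup_is_full_write_fault, int max_retries)
{
    /* Page after the first read access: private copy, but still clean. */
    struct model_pte pte = { .cow_broken = true, .sw_dirty = false };
    int lock = 0;

    for (int retries = 0; retries <= max_retries; retries++) {
        if (model_cmpxchg(&pte, &lock, 1) == 0)
            return retries;
        model_fault_in_writeable(&pte, gup_is_full_write_fault);
    }
    return -1;
}
```

In this model, model_futex_lock_pi(false, 1000) never gets past the cmpxchg and gives up at the retry limit, while model_futex_lock_pi(true, 1000) succeeds on the first retry, which is the behaviour the reply argues gup(.write=1) should have.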