Re: [PATCH 11/15] ptwalk: copy_pte_range hang

From: Nick Piggin
Date: Wed Mar 09 2005 - 18:44:45 EST

Hugh Dickins wrote:
> This patch is the odd-one-out of the sequence. The one before adjusted
> copy_pte_range from a for loop to a do while loop, and it was therefore
> simplest to check for lockbreak before copying pte: possibility that it
> might keep getting preempted without making progress under some loads.
>
> Some loads such as startup: 2*HT*P4 with preemption cannot even reach
> multiuser login. Suspect needs_lockbreak is broken, can get in a state
> when it remains forever true. Investigate that later: for now, and for
> all time, it makes sense to aim for a little progress before breaking
> out; and we can manage more pte_nones than copies.

(Just to reiterate a private mail sent to Hugh earlier)

Yeah, I think lockbreak is broken: the inner spinlock never has a
cond_resched_lock performed on it, so its break_lock is never set
back to 0, and need_lockbreak then always returns 1 for it.

IMO, spin_lock should set break_lock to 0 itself; then
cond_resched_lock need not bother with it.

