Re: [PATCH 0/9] Avoid populating unbounded num of ptes with mmap_sem held

From: Michel Lespinasse
Date: Fri Jan 04 2013 - 17:58:00 EST

On Fri, Jan 4, 2013 at 10:16 AM, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
> I still have quite a few instances of 2-6 ms of latency due to
> "call_rwsem_down_read_failed __do_page_fault do_page_fault
> page_fault". Any idea why? I don't know any great way to figure out
> who is holding mmap_sem at the time. Given what my code is doing, I
> suspect the contention is due to mmap or munmap on a file. MCL_FUTURE
> is set, and MAP_POPULATE is not set.
> It could be the other thread calling mmap and getting preempted (or
> otherwise calling schedule()). Grr.

The simplest way to find out who's holding the lock too long might be
to enable CONFIG_LOCK_STAT. This will slow things down a little, but
give you lots of useful information including which threads hold
mmap_sem the longest and the call stack for where they grab it from.
See Documentation/lockstat.txt.
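
Once the kernel is built with that option and collection is switched on
(per the lockstat documentation, by writing 1 to
/proc/sys/kernel/lock_stat), the statistics show up in /proc/lock_stat.
A minimal sketch for pulling out just the mmap_sem entries follows. The
helper name lines_mentioning is my own invention for illustration, the
script does plain substring matching rather than real parsing of the
report, and reading the file will usually need root:

```python
import os

# Present only when the kernel was built with lock statistics support.
LOCK_STAT_PATH = "/proc/lock_stat"


def lines_mentioning(text, lock_name):
    """Return the lines of a lock_stat dump that mention lock_name.

    Hypothetical helper: a plain substring match, not a parser.
    """
    return [line for line in text.splitlines() if lock_name in line]


def main():
    if not os.path.exists(LOCK_STAT_PATH):
        print("no /proc/lock_stat; kernel probably lacks lock statistics")
        return
    try:
        with open(LOCK_STAT_PATH) as f:
            text = f.read()
    except PermissionError:
        print("cannot read /proc/lock_stat; try running as root")
        return
    for line in lines_mentioning(text, "mmap_sem"):
        print(line)


if __name__ == "__main__":
    main()
```

Run it after reproducing the latency spikes, so the counters cover the
interesting window.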

I think munmap is a likely culprit, as it still runs with mmap_sem
held for write (I do plan to work on this next). But it's hard to
be sure without lock statistics :)

Michel "Walken" Lespinasse
