Re: [2.6.26-rc7] shrink_icache from pagefault locking (nee: nfsd hangs for a few sec)...

From: Mel Gorman
Date: Sun Jun 22 2008 - 14:10:28 EST


On (22/06/08 10:58), Daniel J Blueman didst pronounce:
> I'm seeing a similar issue [2] to what was recently reported [1] by
> Alexander, but with another workload involving XFS and memory
> pressure.
>

Is NFS involved or is this XFS only? It looks like XFS-only but no harm in
being sure.

I'm beginning to wonder if the problem is that a lot of dirty inodes are
being written back in this path and we stall while that happens. I still
don't see why we trigger this now but didn't before 2.6.26-rc1, or why it
bisects to the zonelist modifications. Diffing the reclaim and allocation
paths between 2.6.25 and 2.6.26-rc1 has not yet yielded any candidate that
would explain it.
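For reference, the cycle lockdep is complaining about below reduces to a
plain ABBA inversion between the fault path and the reclaim path. A rough
user-space sketch of that shape (illustrative only, not kernel code: plain
pthread mutexes stand in for mmap_sem and iprune_mutex, and the intermediate
i_iolock edge from the #1 stanza is collapsed into a direct dependency):

	/*
	 * ABBA sketch of the reported cycle. Thread A models the
	 * page-fault path (mmap_sem -> iprune_mutex); thread B models
	 * the kswapd/reclaim path, where holding iprune_mutex leads,
	 * via XFS inode reclaim, back to a lock ordered after mmap_sem.
	 * Run long enough, the two threads can deadlock -- the hazard
	 * lockdep flags before it ever happens at runtime.
	 */
	#include <pthread.h>
	#include <stdio.h>

	static pthread_mutex_t mmap_sem     = PTHREAD_MUTEX_INITIALIZER;
	static pthread_mutex_t iprune_mutex = PTHREAD_MUTEX_INITIALIZER;

	/* Page-fault path: holds mmap_sem, enters reclaim, takes iprune_mutex. */
	static void *fault_path(void *arg)
	{
		(void)arg;
		for (int i = 0; i < 100000; i++) {
			pthread_mutex_lock(&mmap_sem);      /* do_page_fault() */
			pthread_mutex_lock(&iprune_mutex);  /* shrink_icache_memory() */
			pthread_mutex_unlock(&iprune_mutex);
			pthread_mutex_unlock(&mmap_sem);
		}
		return NULL;
	}

	/* Reclaim path: holds iprune_mutex, then needs a lock that lockdep
	 * already ranks after mmap_sem (the i_iolock edge, collapsed here). */
	static void *reclaim_path(void *arg)
	{
		(void)arg;
		for (int i = 0; i < 100000; i++) {
			pthread_mutex_lock(&iprune_mutex);  /* shrink_icache_memory() */
			pthread_mutex_lock(&mmap_sem);      /* i_iolock -> mmap_sem edge */
			pthread_mutex_unlock(&mmap_sem);
			pthread_mutex_unlock(&iprune_mutex);
		}
		return NULL;
	}

	int main(void)
	{
		pthread_t a, b;
		pthread_create(&a, NULL, fault_path, NULL);
		pthread_create(&b, NULL, reclaim_path, NULL);
		pthread_join(a, NULL);
		pthread_join(b, NULL);
		puts("finished without deadlocking (this run got lucky)");
		return 0;
	}

(Compile with gcc -pthread. In the kernel the two locks are an rwsem and a
mutex rather than plain mutexes, but the ordering problem is the same.)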

> The SLUB allocator is in use and the config is at http://quora.org/config-client-debug.
>
> Let me know if you'd like more details/vmlinux objdump etc.
>
> Thanks,
> Daniel
>
> --- [1]
>
> http://groups.google.com/group/fa.linux.kernel/browse_thread/thread/e673c9173d45a735/db9213ef39e4e11c
>
> --- [2]
>
> =======================================================
> [ INFO: possible circular locking dependency detected ]
> 2.6.26-rc7-210c #2
> -------------------------------------------------------
> AutopanoPro/4470 is trying to acquire lock:
> (iprune_mutex){--..}, at: [<ffffffff802d94fd>] shrink_icache_memory+0x7d/0x290
>
> but task is already holding lock:
> (&mm->mmap_sem){----}, at: [<ffffffff805e3e15>] do_page_fault+0x255/0x890
>
> which lock already depends on the new lock.
>
>
> the existing dependency chain (in reverse order) is:
>
> -> #2 (&mm->mmap_sem){----}:
> [<ffffffff80278f4d>] __lock_acquire+0xbdd/0x1020
> [<ffffffff802793f5>] lock_acquire+0x65/0x90
> [<ffffffff805df5ab>] down_read+0x3b/0x70
> [<ffffffff805e3e3c>] do_page_fault+0x27c/0x890
> [<ffffffff805e16cd>] error_exit+0x0/0xa9
> [<ffffffffffffffff>] 0xffffffffffffffff
>
> -> #1 (&(&ip->i_iolock)->mr_lock){----}:
> [<ffffffff80278f4d>] __lock_acquire+0xbdd/0x1020
> [<ffffffff802793f5>] lock_acquire+0x65/0x90
> [<ffffffff8026d746>] down_write_nested+0x46/0x80
> [<ffffffff8039df29>] xfs_ilock+0x99/0xa0
> [<ffffffff8039e0cf>] xfs_ireclaim+0x3f/0x90
> [<ffffffff803ba889>] xfs_finish_reclaim+0x59/0x1a0
> [<ffffffff803bc199>] xfs_reclaim+0x109/0x110
> [<ffffffff803c9541>] xfs_fs_clear_inode+0xe1/0x110
> [<ffffffff802d906d>] clear_inode+0x7d/0x110
> [<ffffffff802d93aa>] dispose_list+0x2a/0x100
> [<ffffffff802d96af>] shrink_icache_memory+0x22f/0x290
> [<ffffffff8029d868>] shrink_slab+0x168/0x1d0
> [<ffffffff8029e0b6>] kswapd+0x3b6/0x560
> [<ffffffff8026921d>] kthread+0x4d/0x80
> [<ffffffff80227428>] child_rip+0xa/0x12
> [<ffffffffffffffff>] 0xffffffffffffffff
>
> -> #0 (iprune_mutex){--..}:
> [<ffffffff80278db7>] __lock_acquire+0xa47/0x1020
> [<ffffffff802793f5>] lock_acquire+0x65/0x90
> [<ffffffff805dedd5>] mutex_lock_nested+0xb5/0x300
> [<ffffffff802d94fd>] shrink_icache_memory+0x7d/0x290
> [<ffffffff8029d868>] shrink_slab+0x168/0x1d0
> [<ffffffff8029db38>] try_to_free_pages+0x268/0x3a0
> [<ffffffff802979d6>] __alloc_pages_internal+0x206/0x4b0
> [<ffffffff80297c89>] __alloc_pages_nodemask+0x9/0x10
> [<ffffffff802b2bc2>] alloc_page_vma+0x72/0x1b0
> [<ffffffff802a3642>] handle_mm_fault+0x462/0x7b0
> [<ffffffff805e3ecc>] do_page_fault+0x30c/0x890
> [<ffffffff805e16cd>] error_exit+0x0/0xa9
> [<ffffffffffffffff>] 0xffffffffffffffff
>
> other info that might help us debug this:
>
> 2 locks held by AutopanoPro/4470:
> #0: (&mm->mmap_sem){----}, at: [<ffffffff805e3e15>] do_page_fault+0x255/0x890
> #1: (shrinker_rwsem){----}, at: [<ffffffff8029d732>] shrink_slab+0x32/0x1d0
>
> stack backtrace:
> Pid: 4470, comm: AutopanoPro Not tainted 2.6.26-rc7-210c #2
>
> Call Trace:
> [<ffffffff80276823>] print_circular_bug_tail+0x83/0x90
> [<ffffffff80275e09>] ? print_circular_bug_entry+0x49/0x60
> [<ffffffff80278db7>] __lock_acquire+0xa47/0x1020
> [<ffffffff802793f5>] lock_acquire+0x65/0x90
> [<ffffffff802d94fd>] ? shrink_icache_memory+0x7d/0x290
> [<ffffffff805dedd5>] mutex_lock_nested+0xb5/0x300
> [<ffffffff802d94fd>] ? shrink_icache_memory+0x7d/0x290
> [<ffffffff802d94fd>] shrink_icache_memory+0x7d/0x290
> [<ffffffff8029d732>] ? shrink_slab+0x32/0x1d0
> [<ffffffff8029d868>] shrink_slab+0x168/0x1d0
> [<ffffffff8029db38>] try_to_free_pages+0x268/0x3a0
> [<ffffffff8029c240>] ? isolate_pages_global+0x0/0x40
> [<ffffffff802979d6>] __alloc_pages_internal+0x206/0x4b0
> [<ffffffff80297c89>] __alloc_pages_nodemask+0x9/0x10
> [<ffffffff802b2bc2>] alloc_page_vma+0x72/0x1b0
> [<ffffffff802a3642>] handle_mm_fault+0x462/0x7b0
> [<ffffffff80277e2f>] ? trace_hardirqs_on+0xbf/0x150
> [<ffffffff805e3e15>] ? do_page_fault+0x255/0x890
> [<ffffffff805e3ecc>] do_page_fault+0x30c/0x890
> [<ffffffff805e16cd>] error_exit+0x0/0xa9
> --
> Daniel J Blueman
>

--
Mel Gorman
Part-time PhD Student, University of Limerick
Linux Technology Center, IBM Dublin Software Lab