Re: [RFC PATCH 0/6] Do not call ->writepage[s] from direct reclaimand use a_ops->writepages() where possible
From: Nick Piggin
Date: Tue Jun 15 2010 - 11:08:14 EST
On Tue, Jun 15, 2010 at 03:51:34PM +0100, Mel Gorman wrote:
> On Tue, Jun 15, 2010 at 04:00:11PM +0200, Andrea Arcangeli wrote:
> > When memory pressure is low, not going into ->writepage may be
> > beneficial from latency prospective too. (but again it depends how
> > much it matters to go in LRU and how beneficial is the cache, to know
> > if it's worth taking clean cache away even if hotter than dirty cache)
> >
> > About the stack overflow did you ever got any stack-debug error?
>
> Not an error. Got a report from Dave Chinner though and it's what kicked
> off this whole routine in the first place. I've been recording stack
> usage figures but not reporting them. In reclaim I'm getting to about 5K
> deep but this was on simple storage and XFS was ignoring attempts for
> reclaim to writeback.
>
> http://lkml.org/lkml/2010/4/13/121
>
> Here is one my my own stack traces though
>
> Depth Size Location (49 entries)
> ----- ---- --------
> 0) 5064 304 get_page_from_freelist+0x2e4/0x722
> 1) 4760 240 __alloc_pages_nodemask+0x15f/0x6a7
> 2) 4520 48 kmem_getpages+0x61/0x12c
> 3) 4472 96 cache_grow+0xca/0x272
> 4) 4376 80 cache_alloc_refill+0x1d4/0x226
> 5) 4296 64 kmem_cache_alloc+0x129/0x1bc
> 6) 4232 16 mempool_alloc_slab+0x16/0x18
> 7) 4216 144 mempool_alloc+0x56/0x104
> 8) 4072 16 scsi_sg_alloc+0x48/0x4a [scsi_mod]
> 9) 4056 96 __sg_alloc_table+0x58/0xf8
> 10) 3960 32 scsi_init_sgtable+0x37/0x8f [scsi_mod]
> 11) 3928 32 scsi_init_io+0x24/0xce [scsi_mod]
> 12) 3896 48 scsi_setup_fs_cmnd+0xbc/0xc4 [scsi_mod]
> 13) 3848 144 sd_prep_fn+0x1d3/0xc13 [sd_mod]
> 14) 3704 64 blk_peek_request+0xe2/0x1a6
> 15) 3640 96 scsi_request_fn+0x87/0x522 [scsi_mod]
> 16) 3544 32 __blk_run_queue+0x88/0x14b
> 17) 3512 48 elv_insert+0xb7/0x254
> 18) 3464 48 __elv_add_request+0x9f/0xa7
> 19) 3416 128 __make_request+0x3f4/0x476
> 20) 3288 192 generic_make_request+0x332/0x3a4
> 21) 3096 64 submit_bio+0xc4/0xcd
> 22) 3032 80 _xfs_buf_ioapply+0x222/0x252 [xfs]
> 23) 2952 48 xfs_buf_iorequest+0x84/0xa1 [xfs]
> 24) 2904 32 xlog_bdstrat+0x47/0x4d [xfs]
> 25) 2872 64 xlog_sync+0x21a/0x329 [xfs]
> 26) 2808 48 xlog_state_release_iclog+0x9b/0xa8 [xfs]
> 27) 2760 176 xlog_write+0x356/0x506 [xfs]
> 28) 2584 96 xfs_log_write+0x5a/0x86 [xfs]
> 29) 2488 368 xfs_trans_commit_iclog+0x165/0x2c3 [xfs]
> 30) 2120 80 _xfs_trans_commit+0xd8/0x20d [xfs]
> 31) 2040 240 xfs_iomap_write_allocate+0x247/0x336 [xfs]
> 32) 1800 144 xfs_iomap+0x31a/0x345 [xfs]
> 33) 1656 48 xfs_map_blocks+0x3c/0x40 [xfs]
> 34) 1608 256 xfs_page_state_convert+0x2c4/0x597 [xfs]
> 35) 1352 64 xfs_vm_writepage+0xf5/0x12f [xfs]
> 36) 1288 32 __writepage+0x17/0x34
> 37) 1256 288 write_cache_pages+0x1f3/0x2f8
> 38) 968 16 generic_writepages+0x24/0x2a
> 39) 952 64 xfs_vm_writepages+0x4f/0x5c [xfs]
> 40) 888 16 do_writepages+0x21/0x2a
> 41) 872 48 writeback_single_inode+0xd8/0x2f4
> 42) 824 112 writeback_inodes_wb+0x41a/0x51e
> 43) 712 176 wb_writeback+0x13d/0x1b7
> 44) 536 128 wb_do_writeback+0x150/0x167
> 45) 408 80 bdi_writeback_task+0x43/0x117
> 46) 328 48 bdi_start_fn+0x76/0xd5
> 47) 280 96 kthread+0x82/0x8a
> 48) 184 184 kernel_thread_helper+0x4/0x10
>
> XFS as you can see is quite deep there. Now consider if
> get_page_from_freelist() there had entered direct reclaim and then tried
> to writeback a page. That's the problem that is being worried about.
It would be a problem because it should be !__GFP_IO at that point so
something would be seriously broken if it called ->writepage again.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/