Re: [PATCH v2 1/2] mm:vmscan: the dirty folio in folio_list skip unmap

From: zhiguojiang
Date: Mon Oct 23 2023 - 04:07:46 EST




On 2023/10/20 12:15, Matthew Wilcox wrote:
On Fri, Oct 20, 2023 at 11:59:33AM +0800, zhiguojiang wrote:
@@ -1261,43 +1305,6 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
                      enum ttu_flags flags = TTU_BATCH_FLUSH;
                      bool was_swapbacked = folio_test_swapbacked(folio);

-                     if (folio_test_dirty(folio)) {
-                             /*
-                              * Only kswapd can writeback filesystem folios
-                              * to avoid risk of stack overflow. But avoid
-                              * injecting inefficient single-folio I/O into
-                              * flusher writeback as much as possible: only
-                              * write folios when we've encountered many
-                              * dirty folios, and when we've already scanned
-                              * the rest of the LRU for clean folios and see
-                              * the same dirty folios again (with the reclaim
-                              * flag set).
-                              */
-                             if (folio_is_file_lru(folio) &&
-                                     (!current_is_kswapd() ||
-                                      !folio_test_reclaim(folio) ||
-                                      !test_bit(PGDAT_DIRTY, &pgdat->flags))) {
-                                     /*
-                                      * Immediately reclaim when written back.
-                                      * Similar in principle to folio_deactivate()
-                                      * except we already have the folio isolated
-                                      * and know it's dirty
-                                      */
-                                     node_stat_mod_folio(folio, NR_VMSCAN_IMMEDIATE,
-                                                     nr_pages);
-                                     folio_set_reclaim(folio);
-
-                                     goto activate_locked;
-                             }
-
-                             if (references == FOLIOREF_RECLAIM_CLEAN)
-                                     goto keep_locked;
-                             if (!may_enter_fs(folio, sc->gfp_mask))
-                                     goto keep_locked;
-                             if (!sc->may_writepage)
-                                     goto keep_locked;
-                     }
-
                      if (folio_test_pmd_mappable(folio))
                              flags |= TTU_SPLIT_HUGE_PMD;

I'm confused. Did you apply this on top of v1 by accident?
Hi,
According to my modified mm_vmscan_lru_shrink_inactive test tracelog, in the
You're missing David's point. You've generated this patch against ...
something ... that isn't upstream. Probably against v1 of your
patch. Please check your git tree.

32 scanned inactive file pages, 20 were dirty, and the 20 dirty pages were
not reclaimed, but try_to_unmap still took 20us on them.

I think unreclaimed dirty folios on the inactive file LRU can skip
try_to_unmap. Please help to continue the review. Thanks.
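
To make the condition under discussion concrete, here is a minimal userspace sketch of the check from the quoted hunk that sends a dirty file folio to activate_locked without writing it back. The struct names, field names and the helper name are hypothetical and exist only for illustration; each field stands in for the kernel call named in its comment. This is not kernel code and not the patch itself.

```c
#include <stdbool.h>

/* Hypothetical model of the per-folio state the check consults. */
struct folio_state {
    bool dirty;        /* stands in for folio_test_dirty(folio) */
    bool file_lru;     /* stands in for folio_is_file_lru(folio) */
    bool reclaim_flag; /* stands in for folio_test_reclaim(folio) */
};

/* Hypothetical model of the reclaim context. */
struct reclaim_ctx {
    bool is_kswapd;    /* stands in for current_is_kswapd() */
    bool pgdat_dirty;  /* stands in for test_bit(PGDAT_DIRTY, &pgdat->flags) */
};

/*
 * Mirrors the condition in the quoted hunk: returns true when a dirty
 * file folio would be activated instead of written back, so any
 * try_to_unmap() work done on it beforehand cannot lead to reclaim
 * in this pass.
 */
static bool dirty_file_folio_is_activated(const struct folio_state *f,
                                          const struct reclaim_ctx *c)
{
    if (!f->dirty || !f->file_lru)
        return false;

    return !c->is_kswapd || !f->reclaim_flag || !c->pgdat_dirty;
}
```

With this model, a dirty file folio seen by kswapd for the first time (reclaim flag not yet set) returns true, which would match the nr_activate[1]=20 folios in the trace if they followed this path.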

kswapd0-99      (     99) [005] .....   687.793724: mm_vmscan_lru_shrink_inactive: [Justin] nid 0 scan=32 isolate=32 reclamed=12 nr_dirty=20 nr_unqueued_dirty=20 nr_writeback=0 nr_congested=0 nr_immediate=0 nr_activate[0]=0 nr_activate[1]=20 nr_ref_keep=0 nr_unmap_fail=0 priority=2 file=RECLAIM_WB_FILE|RECLAIM_WB_ASYNC total=39 exe=0 reference_cost=5 reference_exe=0 unmap_cost=21 unmap_exe=0 dirty_unmap_cost=20 dirty_unmap_exe=0 pageout_cost=0 pageout_exe=0
Are you seeing measurable changes for any workloads? It certainly seems
like you should, but it would help if you chose a test from mmtests and
showed how performance changed on your system.
In one mmtests run, the maximum time spent on a futile reclaim attempt for dirty folios in folio_list that cannot be paged out and end up activated in shrink_folio_list() was: cost=51us, exe=2365us.

Using the formula dirty_cost / total_cost * 100%, the reclaim efficiency for dirty folios can be improved by 53.13% and 82.95% in the two traces below.
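
As a worked check of that formula, here is a small C sketch. The helper name dirty_cost_percent is hypothetical; the inputs are the dirty_cost and total_cost fields of the two trace lines in this mail (51/96 and 73/88).

```c
#include <assert.h>

/*
 * Share of the total shrink cost that was spent on dirty folios,
 * in percent: dirty_cost / total_cost * 100.
 * Both arguments are in the same time unit (microseconds in the traces).
 */
static double dirty_cost_percent(unsigned int dirty_cost,
                                 unsigned int total_cost)
{
    assert(total_cost > 0); /* avoid division by zero */
    return 100.0 * dirty_cost / total_cost;
}
```

dirty_cost_percent(51, 96) evaluates to 53.125 and dirty_cost_percent(73, 88) to roughly 82.95, which agrees with the two figures quoted above.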

So this patch can improve shrink efficiency and reduce the workload of kswapd to some extent.

kswapd0-96      (     96) [005] .....   387.218548: mm_vmscan_lru_shrink_inactive: [Justin] nid 0 nr_scanned 32 nr_taken 32 nr_reclaimed 31 nr_dirty  1 nr_unqueued_dirty  1 nr_writeback 0 nr_activate[1]  1 nr_ref_keep  0 f RECLAIM_WB_FILE|RECLAIM_WB_ASYNC total_cost 96 total_exe 2365 dirty_cost 51 total_exe 2365

kswapd0-96      (     96) [006] .....   412.822532: mm_vmscan_lru_shrink_inactive: [Justin] nid 0 nr_scanned 32 nr_taken 32 nr_reclaimed  0 nr_dirty 32 nr_unqueued_dirty 32 nr_writeback 0 nr_activate[1] 19 nr_ref_keep 13 f RECLAIM_WB_FILE|RECLAIM_WB_ASYNC total_cost 88 total_exe 605  dirty_cost 73 total_exe 605