Re: [PATCH v5] mm: shrink skip folio mapped by an exiting process

From: Barry Song
Date: Mon Jul 08 2024 - 17:35:01 EST


On Tue, Jul 9, 2024 at 1:11 AM zhiguojiang <justinjiang@xxxxxxxx> wrote:
>
>
>
> 在 2024/7/8 20:41, Barry Song 写道:
> >
> >
> > zhiguojiang <justinjiang@xxxxxxxx> 于 2024年7月9日周二 00:25写道:
> >
> >
> >
> > 在 2024/7/8 20:17, zhiguojiang 写道:
> > >
> > >
> > > 在 2024/7/8 19:02, Barry Song 写道:
> > >> On Mon, Jul 8, 2024 at 9:04 PM Zhiguo Jiang <justinjiang@xxxxxxxx>
> > >> wrote:
> > >>> The releasing process of the non-shared anonymous folio mapped
> > >>> solely by
> > >>> an exiting process may go through two flows: 1) the anonymous
> > folio is
> > >>> firstly is swaped-out into swapspace and transformed into a
> > swp_entry
> > >>> in shrink_folio_list; 2) then the swp_entry is released in the
> > process
> > >>> exiting flow. This will increase the cpu load of releasing a
> > non-shared
> > >>> anonymous folio mapped solely by an exiting process, because
> > the folio
> > >>> go through swap-out and the releasing the swapspace and swp_entry.
> > >>>
> > >>> When system is low memory, it is more likely to occur, because
> > more
> > >>> backend applidatuions will be killed.
> > >>>
> > >>> The modification is that shrink skips the non-shared anonymous
> > folio
> > >>> solely mapped by an exting process and the folio is only released
> > >>> directly in the process exiting flow, which will save swap-out
> > time
> > >>> and alleviate the load of the process exiting.
> > >>>
> > >>> Signed-off-by: Zhiguo Jiang <justinjiang@xxxxxxxx>
> > >>> ---
> > >>>
> > >>> Change log:
> > >>> v4->v5:
> > >>> 1.Modify to skip non-shared anonymous folio only.
> > >>> 2.Update comments for pra->referenced = -1.
> > >>> v3->v4:
> > >>> 1.Modify that the unshared folios mapped only in exiting task
> > are skip.
> > >>> v2->v3:
> > >>> Nothing.
> > >>> v1->v2:
> > >>> 1.The VM_EXITING added in v1 patch is removed, because it will
> > fail
> > >>> to compile in 32-bit system.
> > >>>
> > >>> mm/rmap.c | 13 +++++++++++++
> > >>> mm/vmscan.c | 7 ++++++-
> > >>> 2 files changed, 19 insertions(+), 1 deletion(-)
> > >>>
> > >>> diff --git a/mm/rmap.c b/mm/rmap.c
> > >>> index 26806b49a86f..5b5281d71dbb
> > >>> --- a/mm/rmap.c
> > >>> +++ b/mm/rmap.c
> > >>> @@ -843,6 +843,19 @@ static bool folio_referenced_one(struct
> > folio
> > >>> *folio,
> > >>> int referenced = 0;
> > >>> unsigned long start = address, ptes = 0;
> > >>>
> > >>> + /*
> > >>> + * Skip the non-shared anonymous folio mapped solely by
> > >>> + * the single exiting process, and release it directly
> > >>> + * in the process exiting.
> > >>> + */
> > >>> + if ((!atomic_read(&vma->vm_mm->mm_users) ||
> > >>> + test_bit(MMF_OOM_SKIP, &vma->vm_mm->flags)) &&
> > >>> + folio_test_anon(folio) &&
> > >>> folio_test_swapbacked(folio) &&
> > >>> + !folio_likely_mapped_shared(folio)) {
> > >>> + pra->referenced = -1;
> > >>> + return false;
> > >>> + }
> > >>> +
> > >>> while (page_vma_mapped_walk(&pvmw)) {
> > >>> address = pvmw.address;
> > > Sure, I agree with your modification suggestions. This way,
> > using PTL
> > > indeed sure
> > > that the folio is mapped by this process.
> > > Thanks
> > >> As David suggested, what about the below?
> > >>
> > >> @@ -883,6 +870,21 @@ static bool folio_referenced_one(struct folio
> > >> *folio,
> > >> continue;
> > >> }
> > >>
> > >> + /*
> > >> + * Skip the non-shared anonymous folio mapped
> > solely by
> > >> + * the single exiting process, and release it
> > directly
> > >> + * in the process exiting.
> > >> + */
> > >> + if ((!atomic_read(&vma->vm_mm->mm_users) ||
> > >> + test_bit(MMF_OOM_SKIP,
> > >> &vma->vm_mm->flags)) &&
> > >> + folio_test_anon(folio) &&
> > >> folio_test_swapbacked(folio) &&
> > >> + !folio_likely_mapped_shared(folio)) {
> > >> + pra->referenced = -1;
> > >> + page_vma_mapped_walk_done(&pvmw);
> > >> + return false;
> > >> + }
> > >> +
> > >> if (pvmw.pte) {
> > >> if (lru_gen_enabled() &&
> > >> pte_young(ptep_get(pvmw.pte))) {
> > >>
> > >>
> > >> By the way, I am not convinced that using test_bit(MMF_OOM_SKIP,
> > >> &vma->vm_mm->flags) is
> > >> correct (I think it is wrong). For example, global_init can
> > >> directly have it:
> > >> if (is_global_init(p)) {
> > >> can_oom_reap = false;
> > >> set_bit(MMF_OOM_SKIP, &mm->flags);
> > >> pr_info("oom killer %d (%s) has mm
> > pinned by
> > >> %d (%s)\n",
> > >> task_pid_nr(victim),
> > >> victim->comm,
> > >> task_pid_nr(p), p->comm);
> > >> continue;
> > >> }
> > >>
> > >> And exit_mmap() automatically has MMF_OOM_SKIP.
> > >>
> > >> What is the purpose of this check? Is there a better way to
> > determine
> > >> if a process is an
> > >> OOM target? What about check_stable_address_space() ?
> > > 1.Sorry, I overlook the situation with if (is_global_init(p)),
> > > MMF_OOM_SKIP is indeed not suitable.
> > >
> > > 2.check_stable_address_space() can indicate oom_reaper, but it
> > seems
> > > unable to identify the situation where the process exits normally.
> > > What about task_is_dying()? static inline bool
> > task_is_dying(void) {
> > > return tsk_is_oom_victim(current) ||
> > fatal_signal_pending(current) ||
> > > (current->flags & PF_EXITING); } Thanks
> > We can migrate task_is_dying() from mm/memcontrol.c to
> > include/linux/oom.h
> > > static inline bool task_is_dying(void)
> > > {
> > > return tsk_is_oom_victim(current) ||
> > fatal_signal_pending(current) ||
> > > (current->flags & PF_EXITING);
> > > }
> >
> >
> > no. current is kswapd.
> Hi Barry,
>
> It seems feasible for check_stable_address_space() replacing MMF_OOM_SKIP.
> check_stable_address_space() can indicate oom kill, and
> !atomic_read(&vma->vm_mm->mm_users)
> can indicate the normal process exiting.
>
> /*
> * Skip the non-shared anonymous folio mapped solely by
> * the single exiting process, and release it directly
> * in the process exiting.
> */
> if ((!atomic_read(&vma->vm_mm->mm_users) ||
> check_stable_address_space(vma->vm_mm)) &&
> folio_test_anon(folio) && folio_test_swapbacked(folio) &&
> !folio_likely_mapped_shared(folio)) {
> pra->referenced = -1;
> page_vma_mapped_walk_done(&pvmw);
> return false;
> }
>

Yes, + David, Willy (when you send a new version, please CC people who have
participated and describe how you have addressed comments from those
people.)

I also think we actually can remove "folio_test_anon(folio)".

So It could be,

@@ -883,6 +871,21 @@ static bool folio_referenced_one(struct folio *folio,
continue;
}

+ /*
+ * Skip the non-shared swapbacked folio mapped solely by
+ * the exiting or OOM-reaped process. This avoids redundant
+ * swap-out followed by an immediate unmap.
+ */
+ if ((!atomic_read(&vma->vm_mm->mm_users) ||
+ check_stable_address_space(vma->vm_mm)) &&
+ folio_test_swapbacked(folio) &&
+ !folio_likely_mapped_shared(folio)) {
+ pra->referenced = -1;
+ page_vma_mapped_walk_done(&pvmw);
+ return false;
+ }
+
if (pvmw.pte) {
if (lru_gen_enabled() &&
pte_young(ptep_get(pvmw.pte))) {

> Thanks
> Zhiguo
> >
> >
> > >>
> > >>
> > >>> diff --git a/mm/vmscan.c b/mm/vmscan.c
> > >>> index 0761f91b407f..bae7a8bf6b3d
> > >>> --- a/mm/vmscan.c
> > >>> +++ b/mm/vmscan.c
> > >>> @@ -863,7 +863,12 @@ static enum folio_references
> > >>> folio_check_references(struct folio *folio,
> > >>> if (vm_flags & VM_LOCKED)
> > >>> return FOLIOREF_ACTIVATE;
> > >>>
> > >>> - /* rmap lock contention: rotate */
> > >>> + /*
> > >>> + * There are two cases to consider.
> > >>> + * 1) Rmap lock contention: rotate.
> > >>> + * 2) Skip the non-shared anonymous folio mapped solely by
> > >>> + * the single exiting process.
> > >>> + */
> > >>> if (referenced_ptes == -1)
> > >>> return FOLIOREF_KEEP;
> > >>>
> > >>> --
> > >>> 2.39.0
> > >>>
> > >> Thanks
> > >> Barry
> > >
> >
>

Thanks
Barry