Re: [PATCH 1/2] sched/wait: Break up long wake list walk

From: Linus Torvalds
Date: Tue Aug 22 2017 - 16:42:19 EST


On Tue, Aug 22, 2017 at 12:55 PM, Liang, Kan <kan.liang@xxxxxxxxx> wrote:
>
>> So I propose testing the attached trivial patch.
>
It doesn't work.
> The call stack is the same.

So I would have expected the stack trace to be the same, and I would
even expect the CPU usage to be fairly similar, because you'd just see
the looping move out to the callers (taking the fault again if the page
is - once again - being migrated).

But I was hoping that the wait queues would be shorter, because the
retry loop would be that much bigger.

Oh well.

I'm slightly out of ideas. Apparently the yield() worked ok (apart
from not catching all cases), and maybe we could do a version that
waits on the page bit in the non-contended case, but yields under
contention?
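
Roughly this shape, as a stand-alone user-space sketch (illustration
only - the names here are made up, and the actual kernel version in the
attached patch keys the decision off PageWaiters() and a plain yield()):

/*
 * User-space model of "sleep if uncontended, yield if contended".
 * fake_page and migration_wait are invented for this sketch; they are
 * not kernel interfaces.
 */
#include <pthread.h>
#include <sched.h>
#include <stdbool.h>

struct fake_page {
	pthread_mutex_t mutex;      /* protects the fields below */
	pthread_cond_t  unlocked;   /* signalled when the "PG_locked" bit clears */
	bool            locked;     /* stand-in for PG_locked */
	int             nr_waiters; /* stand-in for PageWaiters() */
};

static void migration_wait(struct fake_page *p)
{
	pthread_mutex_lock(&p->mutex);
	if (p->nr_waiters > 0) {
		/* Contended: don't join the queue, give up the CPU and retry. */
		pthread_mutex_unlock(&p->mutex);
		sched_yield();
		return;
	}
	/* Uncontended: sleep until the "page" is unlocked, as today. */
	p->nr_waiters++;
	while (p->locked)
		pthread_cond_wait(&p->unlocked, &p->mutex);
	p->nr_waiters--;
	pthread_mutex_unlock(&p->mutex);
}

The point is just that a waiter only joins the queue when nobody else
is already on it; anyone hitting a contended page goes back through the
scheduler instead of growing the wake list.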

IOW, maybe this is the best we can do for now? Introducing that
"wait_on_page_migration()" helper might allow us to tweak this a bit
as people come up with better ideas..

And then add Tim's patch for the general worst-case just in case?

Linus

 include/linux/pagemap.h | 7 +++++++
 mm/filemap.c            | 9 +++++++++
 mm/huge_memory.c        | 2 +-
 mm/migrate.c            | 2 +-
 4 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 79b36f57c3ba..d0451f2501ba 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -503,6 +503,7 @@ static inline int lock_page_or_retry(struct page *page, struct mm_struct *mm,
  */
 extern void wait_on_page_bit(struct page *page, int bit_nr);
 extern int wait_on_page_bit_killable(struct page *page, int bit_nr);
+extern void wait_on_page_bit_or_yield(struct page *page, int bit_nr);
 
 /*
  * Wait for a page to be unlocked.
@@ -524,6 +525,12 @@ static inline int wait_on_page_locked_killable(struct page *page)
 	return wait_on_page_bit_killable(compound_head(page), PG_locked);
 }
 
+static inline void wait_on_page_migration(struct page *page)
+{
+	if (PageLocked(page))
+		wait_on_page_bit_or_yield(compound_head(page), PG_locked);
+}
+
 /*
  * Wait for a page to complete writeback
  */
diff --git a/mm/filemap.c b/mm/filemap.c
index a49702445ce0..9e34e7502cac 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1026,6 +1026,15 @@ int wait_on_page_bit_killable(struct page *page, int bit_nr)
 	return wait_on_page_bit_common(q, page, bit_nr, TASK_KILLABLE, false);
 }
 
+void wait_on_page_bit_or_yield(struct page *page, int bit_nr)
+{
+	if (PageWaiters(page)) {
+		yield();
+		return;
+	}
+	wait_on_page_bit(page, bit_nr);
+}
+
 /**
  * add_page_wait_queue - Add an arbitrary waiter to a page's wait queue
  * @page: Page defining the wait queue of interest
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 90731e3b7e58..d94e89ca9f0c 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1443,7 +1443,7 @@ int do_huge_pmd_numa_page(struct vm_fault *vmf, pmd_t pmd)
 		if (!get_page_unless_zero(page))
 			goto out_unlock;
 		spin_unlock(vmf->ptl);
-		wait_on_page_locked(page);
+		wait_on_page_migration(page);
 		put_page(page);
 		goto out;
 	}
diff --git a/mm/migrate.c b/mm/migrate.c
index e84eeb4e4356..f0aa68f775aa 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -308,7 +308,7 @@ void __migration_entry_wait(struct mm_struct *mm, pte_t *ptep,
 	if (!get_page_unless_zero(page))
 		goto out;
 	pte_unmap_unlock(ptep, ptl);
-	wait_on_page_locked(page);
+	wait_on_page_migration(page);
 	put_page(page);
 	return;
 out: