Re: [PATCH 2/2] mlock use lru_add_drain_all_async()

From: Ying Han
Date: Tue Oct 06 2009 - 19:01:09 EST


Hello KOSAKI-san,

I have a few questions about lru_add_drain_all_async(). If I understand
correctly, the reason we call lru_add_drain_all() in mlock() is to
isolate mlocked pages onto the separate LRU in case they are still
sitting in a pagevec.

I also understand the RT use case you describe in the patch
description. My question is: do we have a race after applying the
patch? For example, if the RT task has not given up the CPU by the time
mlock() returns, pages are left in that CPU's pagevec that have not been
drained back to the LRU list. Is that a problem?
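
To make the concern concrete, here is a rough userspace toy model in C.
It is not kernel code. It assumes that the async variant only queues the
per-CPU drain work and returns, and that a CPU running the RT task never
gets to execute that work. All toy_* names are hypothetical and exist
only for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_CPUS 2

/* Toy model of one CPU: pages are batched privately before reaching the LRU. */
struct toy_cpu {
    size_t pagevec_count;  /* pages still batched, not yet on any LRU list */
    bool   busy_with_rt;   /* assumption: queued drain work never runs while set */
    bool   drain_pending;  /* drain work has been queued but not executed */
};

struct toy_system {
    struct toy_cpu cpu[TOY_NR_CPUS];
    size_t lru_count;      /* pages that have reached the LRU lists */
};

/* Models the assumed async behaviour: queue a drain on every CPU, do not wait. */
static void toy_queue_drain_all(struct toy_system *sys)
{
    assert(sys != NULL);
    for (size_t i = 0; i < TOY_NR_CPUS; i++)
        sys->cpu[i].drain_pending = true;
}

/*
 * Run queued drain work on every CPU that is able to run it.
 * Returns the number of CPUs whose drain is still pending afterwards.
 */
static size_t toy_run_pending(struct toy_system *sys)
{
    size_t still_pending = 0;

    assert(sys != NULL);
    for (size_t i = 0; i < TOY_NR_CPUS; i++) {
        struct toy_cpu *c = &sys->cpu[i];

        if (!c->drain_pending)
            continue;
        if (c->busy_with_rt) {
            still_pending++;          /* the worker is starved on this CPU */
            continue;
        }
        sys->lru_count += c->pagevec_count;
        c->pagevec_count = 0;
        c->drain_pending = false;
    }
    return still_pending;
}

/* Pages that are still invisible to the LRU, summed over all CPUs. */
static size_t toy_pages_in_pagevecs(const struct toy_system *sys)
{
    size_t total = 0;

    assert(sys != NULL);
    for (size_t i = 0; i < TOY_NR_CPUS; i++)
        total += sys->cpu[i].pagevec_count;
    return total;
}
```

In this model, if cpu1 has busy_with_rt set when toy_queue_drain_all()
is called, a following toy_run_pending() drains only cpu0, and
toy_pages_in_pagevecs() still reports cpu1's pages. That is the window I
am asking about: mlock() has already returned, but those pages are not
on the LRU yet.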

--Ying

On Mon, Oct 5, 2009 at 7:41 PM, KOSAKI Motohiro
<kosaki.motohiro@xxxxxxxxxxxxxx> wrote:
>
> Recently, Peter Zijlstra reported that an RT task can stall mlock()
> for a very long time.
>
>  Suppose you have 2 cpus, cpu1 is busy doing a SCHED_FIFO-99 while(1),
>  cpu0 does mlock()->lru_add_drain_all(), which does
>  schedule_on_each_cpu(), which then waits for all cpus to complete the
>  work. Except that cpu1, which is busy with the RT task, will never run
>  keventd until the RT load goes away.
>
>  This is not so much an actual deadlock as a serious starvation case.
>
> Actually, mlock() doesn't need to wait for lru_add_drain_all() to
> finish. Thus, this patch replaces it with lru_add_drain_all_async().
>
> Cc: Oleg Nesterov <onestero@xxxxxxxxxx>
> Reported-by: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@xxxxxxxxxxxxxx>
> ---
>  mm/mlock.c |    4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/mm/mlock.c b/mm/mlock.c
> index 22041aa..46a016f 100644
> --- a/mm/mlock.c
> +++ b/mm/mlock.c
> @@ -458,7 +458,7 @@ SYSCALL_DEFINE2(mlock, unsigned long, start, size_t, len)
>        if (!can_do_mlock())
>                return -EPERM;
>
> -       lru_add_drain_all();    /* flush pagevec */
> +       lru_add_drain_all_async();      /* flush pagevec */
>
>        down_write(&current->mm->mmap_sem);
>        len = PAGE_ALIGN(len + (start & ~PAGE_MASK));
> @@ -526,7 +526,7 @@ SYSCALL_DEFINE1(mlockall, int, flags)
>        if (!can_do_mlock())
>                goto out;
>
> -       lru_add_drain_all();    /* flush pagevec */
> +       lru_add_drain_all_async();      /* flush pagevec */
>
>        down_write(&current->mm->mmap_sem);
>
> --
> 1.6.2.5