Re: [RFC v2 PATCH 2/2] mm: mmap: zap pages with read mmap_sem for large mapping
From: Nadav Amit
Date: Tue Jun 19 2018 - 18:17:27 EST
at 4:34 PM, Yang Shi <yang.shi@xxxxxxxxxxxxxxxxx> wrote:
> When running some mmap/munmap scalability tests with large memory (i.e.,
> more than 300GB), the below hung task issue may happen occasionally.
>
> INFO: task ps:14018 blocked for more than 120 seconds.
> Tainted: G E 4.9.79-009.ali3000.alios7.x86_64 #1
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
> message.
> ps D 0 14018 1 0x00000004
>
(snip)
>
> Zapping pages is the most time-consuming part. According to the
> suggestion from Michal Hocko [1], zapping pages can be done while holding
> mmap_sem for read, like what MADV_DONTNEED does. Then re-acquire mmap_sem
> for write to manipulate the vmas.

Is munmap() really equivalent to MADV_DONTNEED + munmap()?

For example, what happens with userfaultfd in this case? Can you get an
extra #PF, which would be visible to userspace, before the munmap is
finished?

In addition, would it be OK for the user to potentially get a zeroed page in
the time window after the MADV_DONTNEED step has removed a PTE and before
the munmap() is done?
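
To make that window concrete, here is a minimal userspace sketch. It assumes
Linux and a private anonymous mapping, and the helper name
byte_seen_after_dontneed() is hypothetical, not something from the patch. It
only shows the sequential behaviour of MADV_DONTNEED while the mapping is
still in place; it does not reproduce the race inside munmap() itself.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper for illustration only.
 * Returns the first byte observed after MADV_DONTNEED, or -1 on failure.
 */
int byte_seen_after_dontneed(void)
{
	size_t len = (size_t)sysconf(_SC_PAGESIZE);
	unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
				MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	int seen;

	if (p == MAP_FAILED)
		return -1;

	/* Fault the page in and fill it with a non-zero pattern. */
	memset(p, 0xaa, len);

	/* Drop the page; the VMA itself stays mapped. */
	if (madvise(p, len, MADV_DONTNEED) != 0) {
		munmap(p, len);
		return -1;
	}

	/* This access takes a fresh fault and is backed by a zero page. */
	seen = *(volatile unsigned char *)p;

	munmap(p, len);
	return seen;
}
```

With a plain munmap(), a thread touching the range either sees its old data
or gets SIGSEGV. If the zap runs first under the read lock, a concurrent
access in between could instead behave like the read above and see zeroes.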
Regards,
Nadav