Re: [PATCH] mm: Free per cpu pages async to shorten program exit time

From: David Hildenbrand
Date: Fri Oct 08 2021 - 08:55:52 EST


On 08.10.21 14:38, Vlastimil Babka wrote:
> On 10/8/21 10:17, David Hildenbrand wrote:
>> On 08.10.21 08:39, ultrachin@xxxxxxx wrote:
>>> From: chen xiaoguang <xiaoggchen@xxxxxxxxxxx>
>>>
>>> Exit time is long when a program has allocated a lot of memory;
>>> the most time-consuming part is freeing that memory, which takes
>>> 99.9% of the total exit time. By freeing asynchronously we can
>>> save 25% of the exit time.
>>>
>>> Signed-off-by: chen xiaoguang <xiaoggchen@xxxxxxxxxxx>
>>> Signed-off-by: zeng jingxiang <linuszeng@xxxxxxxxxxx>
>>> Signed-off-by: lu yihui <yihuilu@xxxxxxxxxxx>
>>
>> I recently discussed with Claudio whether it would be possible to defer
>> tearing down the process MM, because for some use cases (secure/encrypted
>> virtualization, very large mmaps) tearing down the page tables is already
>> the much more expensive operation.
>
> OK, but what exactly is the benefit here? The CPU time will have to be spent
> in any case, but we move it to a context that's not accounted to the exiting
> process. Is that good? Also, if it's a large process that restarts
> immediately and allocates all the memory back again, that memory might not
> be available because it's still being freed in the background, leading to
> a risk of OOM?

One use case I was told about is that if you have a large (secure/encrypted)
VM and shut it down, it might take quite a long time until you can actually
start that very VM again, because tooling assumes that the VM isn't shut down
until the process is gone (closed all files, sockets, etc.).

I also discussed the risk of OOM with Claudio. In some cases we don't care:
for example, we could start the VM on a different node in the cluster, or
there is sufficient memory available to start it on the same node. But there
was also the idea to stop the OOM killer from firing as long as there is
still some MM getting cleaned up, which would make sense to some degree.

--
Thanks,

David / dhildenb