On Thu, Nov 12, 2020 at 09:15:18PM +0100, David Hildenbrand wrote:
Am 12.11.2020 um 20:08 schrieb Mike Rapoport <rppt@xxxxxxxxxx>:
On Thu, Nov 12, 2020 at 05:22:00PM +0100, David Hildenbrand wrote:
On 10.11.20 19:06, Mike Rapoport wrote:
On Tue, Nov 10, 2020 at 06:17:26PM +0100, David Hildenbrand wrote:
On 10.11.20 16:14, Mike Rapoport wrote:
From: Mike Rapoport <rppt@xxxxxxxxxxxxx>
It will be used by the upcoming secret memory implementation.
Signed-off-by: Mike Rapoport <rppt@xxxxxxxxxxxxx>
---
mm/internal.h | 3 +++
mm/mmap.c | 5 ++---
2 files changed, 5 insertions(+), 3 deletions(-)
diff --git a/mm/internal.h b/mm/internal.h
index c43ccdddb0f6..ae146a260b14 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -348,6 +348,9 @@ static inline void munlock_vma_pages_all(struct vm_area_struct *vma)
extern void mlock_vma_page(struct page *page);
extern unsigned int munlock_vma_page(struct page *page);
+extern int mlock_future_check(struct mm_struct *mm, unsigned long flags,
+ unsigned long len);
+
/*
* Clear the page's PageMlocked(). This can be useful in a situation where
* we want to unconditionally remove a page from the pagecache -- e.g.,
diff --git a/mm/mmap.c b/mm/mmap.c
index 61f72b09d990..c481f088bd50 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1348,9 +1348,8 @@ static inline unsigned long round_hint_to_min(unsigned long hint)
return hint;
}
-static inline int mlock_future_check(struct mm_struct *mm,
- unsigned long flags,
- unsigned long len)
+int mlock_future_check(struct mm_struct *mm, unsigned long flags,
+ unsigned long len)
{
unsigned long locked, lock_limit;
So, an interesting question is if you actually want to charge secretmem
pages against mlock now, or if you want a dedicated secretmem cgroup
controller instead?
Well, with the current implementation there are three limits an
administrator can use to control secretmem limits: mlock, memcg and
kernel parameter.
The kernel parameter puts a global upper limit for secretmem usage,
memcg accounts all secretmem allocations, including the unused memory in
large pages caching and mlock allows per task limit for secretmem
mappings, well, like mlock does.
I didn't consider a dedicated cgroup, as it seems we already have enough
existing knobs and a new one would be unnecessary.
To me it feels like the mlock() limit is a wrong fit for secretmem. But
maybe there are other cases of using the mlock() limit without actually
doing mlock() that I am not aware of (most probably :) )?
Secretmem does not explicitly calls to mlock() but it does what mlock()
does and a bit more. Citing mlock(2):
mlock(), mlock2(), and mlockall() lock part or all of the calling
process's virtual address space into RAM, preventing that memory from
being paged to the swap area.
So, based on that secretmem pages are not swappable, I think that
RLIMIT_MEMLOCK is appropriate here.
The page explicitly lists mlock() system calls.
Well, it's mlock() man page, isn't it? ;-)
My thinking was that since secretmem does what mlock() does wrt
swapability, it should at least obey the same limit, i.e.
RLIMIT_MEMLOCK.
E.g., we also don‘tDo you mean gigantic pages in hugetlbfs?
account for gigantic pages - which might be allocated from CMA and are
not swappable.
It seems to me that hugetlbfs accounting is a completely different
story.