Re: [PATCH v17 00/10] mm: introduce memfd_secret system call to create "secret" memory areas

From: David Hildenbrand
Date: Tue Feb 09 2021 - 05:35:39 EST


On 09.02.21 11:23, David Hildenbrand wrote:
A lot of unevictable memory is a concern regardless of CMA/ZONE_MOVABLE.
As I've said it is quite easy to land in a similar situation even with
tmpfs/MAP_ANON|MAP_SHARED on a swapless system. Neither of the two is
really uncommon. It would be even worse that those would be allowed to
consume both CMA/ZONE_MOVABLE.

IIRC, tmpfs/MAP_ANON|MAP_SHARED memory
a) Is movable, can land in ZONE_MOVABLE/CMA
b) Can be limited by sizing tmpfs appropriately

AFAIK, what you describe is a problem with memory overcommit, not with zone
imbalances (below). Or what am I missing?

It can be a problem for both. If you have just too much shm (do not
forget about MAP_SHARED|MAP_ANON which is much harder to size from an
admin POV) then migratability doesn't really help because you need
free memory to migrate to. Without reclaimability this can easily become a
problem. That is why I am saying this is not really a new problem.
Swapless systems are not all that uncommon.

I get your point, it's similar but still different. "no memory in the
system" vs. "plenty of unusable free memory available in the system".

In many setups, memory for user space applications can go to
ZONE_MOVABLE just fine. ZONE_NORMAL etc. can be used for supporting user
space memory (e.g., page tables) and other kernel stuff.

Like, have 4GB of ZONE_MOVABLE with 2GB of ZONE_NORMAL. Have an
application (database) that allocates 4GB of memory. Works just fine.
The zone ratio ends up being a problem for example with many processes
(-> many page tables).
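
To put a rough number on the "many page tables" case, here is a
back-of-the-envelope sketch. It assumes x86-64 with 4 KiB pages and 8-byte
page table entries, counts only the leaf level of the page tables, and uses
a made-up workload in which every process maps the same 4 GiB. None of
these figures come from the thread.

```python
# Assumptions (not from the thread): x86-64, 4 KiB pages, 8-byte page table
# entries, and only the last (leaf) level of the page tables is counted.
PAGE_SIZE = 4 * 1024
PTE_SIZE = 8
MIB = 1024 ** 2
GIB = 1024 ** 3


def leaf_page_table_bytes(mapped_bytes):
    """Bytes of leaf-level page tables needed to map `mapped_bytes`."""
    pages = mapped_bytes // PAGE_SIZE
    return pages * PTE_SIZE


per_process = leaf_page_table_bytes(4 * GIB)
print(per_process // MIB, "MiB of leaf page tables per process")

# Hypothetical workload: several processes, each mapping the same 4 GiB.
for nprocs in (1, 64, 256):
    total = nprocs * per_process
    print(nprocs, "processes:", total / GIB, "GiB of page tables")
```

Under these assumptions a single process is harmless, but a few hundred
processes mapping the same region already need on the order of the whole
2GB ZONE_NORMAL for page tables alone.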

Not being able to put user space memory into the movable zone is a
special case. And we are introducing yet another special case here
(besides vfio, rdma, unmigratable huge pages like gigantic pages).

With plenty of secretmem, looking at /proc/meminfo Total vs. Free can be
a big lie about how your system behaves.

One has to be very careful when relying on CMA or movable zones. This is
definitely worth a comment in the kernel command line parameter
documentation. But this is not a new problem.

I see the following thing worth documenting:

Assume you have a system with 2GB of ZONE_NORMAL/ZONE_DMA and 4GB of
ZONE_MOVABLE/CMA.

Assume you make use of 1.5GB of secretmem. Your system might run into OOM
any time although you still have plenty of memory on ZONE_MOVABLE (and even
swap!), simply because you are making excessive use of unmovable allocations
(for user space!) in an environment where you should not make excessive use
of unmovable allocations (e.g., where should page tables go?).
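
The arithmetic of that example can be sketched as a toy model. It
deliberately ignores watermarks, lowmem reserves, per-zone fragmentation and
everything else the real allocator does, and the function and field names
are made up for illustration. The only rule it encodes is that unmovable
allocations cannot be placed in ZONE_MOVABLE/CMA.

```python
GIB = 1024 ** 3


def zone_headroom(normal, movable, unmovable_used, movable_used=0):
    """Toy model: unmovable allocations can only be served from `normal`."""
    return {
        # What a Total vs. Free comparison would suggest is available.
        "free_total": normal + movable - unmovable_used - movable_used,
        # What is actually left for further unmovable allocations
        # (page tables, other kernel memory, more secretmem).
        "free_for_unmovable": normal - unmovable_used,
    }


state = zone_headroom(
    normal=2 * GIB,
    movable=4 * GIB,
    unmovable_used=int(1.5 * GIB),  # the secretmem from the example
)
for name, value in state.items():
    print(name, value / GIB, "GiB")
```

With the numbers from the example, the system looks like it has 4.5 GiB
free, while only 0.5 GiB remains for anything that cannot live in the
movable zone.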

Yes, you are right of course and I am not really disputing this. But I
would argue that with 2:1 Movable/Normal problems are to be expected
already. "Lowmem" allocations can easily trigger OOM even without secret
mem in the picture. All it takes is allocating a lot of GFP_KERNEL or
even GFP_{HIGH}USER. Really, it is CMA/MOVABLE that are the elephant in the
room and one has to be really careful when relying on them.

Right, it's all about what the setup actually needs. Sure, there are
cases where you need significantly more GFP_KERNEL/GFP_{HIGH}USER such
that a 2:1 ratio is not feasible. But I claim that these are corner cases.

Secretmem gives user space the option to allocate a lot of
GFP_{HIGH}USER memory. If I am not wrong, "ulimit -a" tells me that each
application on F33 can allocate 16 GiB (!) of secretmem.

Got to learn to do my math. It's 16 MiB - so as a default it's less dangerous than I thought!
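
The mix-up is easy to make because the shell reports such limits in kbytes.
A minimal sketch for reading a limit directly and printing it in MiB is
below. It assumes the relevant limit is RLIMIT_MEMLOCK, which is my reading
of how secretmem is accounted and is not stated in this mail; the helper
name is made up for illustration.

```python
import resource


def format_limit(nbytes):
    """Render an rlimit value (in bytes) as MiB, or 'unlimited'."""
    if nbytes == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{nbytes / 1024 ** 2:g} MiB"


# Assumption: RLIMIT_MEMLOCK is the limit that applies here.
soft, hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
print("soft:", format_limit(soft), "hard:", format_limit(hard))

# A shell value of 16384 kbytes corresponds to 16 MiB, not 16 GiB.
print(format_limit(16384 * 1024))
```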

--
Thanks,

David / dhildenb