Re: [PATCH 03/13] mm: shmem: provide oom badness for shmem files

From: Christian König
Date: Mon Jun 13 2022 - 10:46:47 EST


Am 13.06.22 um 09:45 schrieb Michal Hocko:
On Sat 11-06-22 10:06:18, Christian König wrote:
Am 10.06.22 um 16:16 schrieb Michal Hocko:
[...]
I could of course add something to struct page to track which memcg (or
process) it was charged against, but extending struct page is most likely a
no-go.
Struct page already maintains is memcg. The one which has charged it and
it will stay constatnt throughout of the allocation lifetime (cgroup v1
has a concept of the charge migration but this hasn't been adopted in
v2).

We have a concept of active_memcg which allows to charge against a
different memcg than the allocating context. From your example above I
do not think this is really usable for the described usecase as the X is
not aware where the request comes from?

Well X/Wayland is aware, but not the underlying kernel drivers.

When X/Wayland would want to forward this information to the kernel we would need to extend the existing UAPI quite a bit. And that of course doesn't help us at all with existing desktops.

Alternative I could try to track the "owner" of a buffer (e.g. a shmem
file), but then it can happen that one processes creates the object and
another one is writing to it and actually allocating the memory.
If you can enforce that the owner is really responsible for the
allocation then all should be fine. That would require MAP_POPULATE like
semantic and I suspect this is not really feasible with the existing
userspace. It would be certainly hard to enforce for bad players.

I've tried this today and the result was: "BUG: Bad rss-counter state mm:000000008751d9ff type:MM_FILEPAGES val:-571286".

The problem is once more that files are not informed when the process clones. So what happened is that somebody called fork() with an mm_struct I've accounted my pages to. The result is just that we messed up the rss_stats and  the the "BUG..." above.

The key difference between normal allocated pages and the resources here is just that we are not bound to an mm_struct in any way.

I could just potentially add a dummy VMA to the mm_struct, but to be honest I think that this would just be an absolutely hack.

So I'm running out of ideas how to fix this, except for adding this per file oom badness like I proposed.

Regards,
Christian.