Re: [PATCH] Increase default MLOCK_LIMIT to 8 MiB

From: David Hildenbrand
Date: Mon Nov 22 2021 - 15:08:56 EST


On 22.11.21 20:53, Jens Axboe wrote:
> On 11/22/21 11:26 AM, David Hildenbrand wrote:
>> On 22.11.21 18:55, Andrew Dona-Couch wrote:
>>> Forgive me for jumping in to an already overburdened thread. But can
>>> someone pushing back on this clearly explain the issue with applying
>>> this patch?
>>
>> It will allow unprivileged users to easily and even "accidentally"
>> allocate more unmovable memory than it should in some environments. Such
>> limits exist for a reason. And there are ways for admins/distros to
>> tweak these limits if they know what they are doing.
>
> But that's entirely the point, the cases where this change is needed are
> already screwed by a distro and the user is the administrator. This is
> _exactly_ the case where things should just work out of the box. If
> you're managing farms of servers, yeah you have competent administration
> and you can be expected to tweak settings to get the best experience and
> performance, but the kernel should provide a sane default. 64K isn't a
> sane default.

0.1% of RAM isn't either.
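
To make the numbers concrete, here is a rough sketch of what "0.1% of RAM" works out to on machines of different sizes, next to the 64 KiB and 8 MiB values under discussion. The helper name and the example machine sizes are made up for illustration. Nothing here reflects how the kernel would actually compute such a limit.

```python
# Hypothetical helper: 0.1% (one per mille) of RAM, in bytes.
# Integer arithmetic avoids floating-point rounding surprises.
def per_mille_of_ram(ram_bytes: int) -> int:
    return ram_bytes // 1000

KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB

# The two fixed values from the thread, for comparison.
FIXED_LIMITS = {"64 KiB": 64 * KIB, "8 MiB": 8 * MIB}

if __name__ == "__main__":
    # Example machine sizes chosen for illustration only.
    for label, ram in [("32 MiB", 32 * MIB), ("16 GiB", 16 * GIB), ("1 TiB", 1024 * GIB)]:
        print(f"{label:>7} RAM -> 0.1% = {per_mille_of_ram(ram)} bytes")
    for label, value in FIXED_LIMITS.items():
        print(f"{label:>7} fixed limit = {value} bytes")
```

The point is only that a percentage-based default scales with the machine: tiny on small systems, very large on big ones.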

>
>> This is not a step into the right direction. This is all just trying to
>> hide the fact that we're exposing FOLL_LONGTERM usage to random
>> unprivileged users.
>>
>> Maybe we could instead try getting rid of FOLL_LONGTERM usage and the
>> memlock limit in io_uring altogether, for example, by using mmu
>> notifiers. But I'm no expert on the io_uring code.
>
> You can't use mmu notifiers without impacting the fast path. This isn't
> just about io_uring, there are other users of memlock right now (like
> bpf) which just makes it even worse.

1) Do we have a performance evaluation? Did someone try it and come to a
conclusion about how bad it would be?

2) Could we provide an mmu notifier variant to ordinary users that's just
good enough but maybe not as fast as what we have today? And limit
FOLL_LONGTERM to special, privileged users?

3) The existence of other memlock users is not an excuse. For example,
VFIO/VDPA have to use it for a reason: there is no way for them to avoid
FOLL_LONGTERM.

>
> We should just make this 0.1% of RAM (min(0.1% ram, 64KB)) or something
> like what was suggested, if that will help move things forward. IMHO the
> 32MB machine is mostly a theoretical case, but whatever.

1) I'm deeply concerned about large ZONE_MOVABLE and MIGRATE_CMA ranges
where FOLL_LONGTERM cannot be used, as that memory is not available for
long-term pinning.

2) With 0.1% RAM it's sufficient to start 1000 processes to break any
system completely and deeply mess up the MM. Oh my.
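
The arithmetic behind point 2), as a back-of-the-envelope sketch. It assumes the limit is enforced per process and that every process pins right up to it, which is the worst case the argument supposes. The function name is hypothetical.

```python
# Hypothetical helper: fraction of RAM pinned if n_processes each pin
# their full per-process limit of `per_mille`/1000 of RAM.
# Assumes per-process enforcement, as in the worst case described above.
def pinned_fraction(ram_bytes: int, n_processes: int, per_mille: int = 1) -> float:
    per_process_limit = ram_bytes * per_mille // 1000
    return n_processes * per_process_limit / ram_bytes

if __name__ == "__main__":
    ram = 16 * 1024**3  # example: a 16 GiB machine
    for n in (100, 500, 1000):
        print(f"{n:>5} processes -> {pinned_fraction(ram, n):.1%} of RAM unmovable")
```

At 1000 processes essentially all of RAM could end up pinned and unmovable, regardless of how much RAM the machine has.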


No, I don't like this, absolutely not. I neither like raising the
memlock limit as default to such high values nor using FOLL_LONGTERM in
cases where it could be avoided for random, unprivileged users.

But I assume this is mostly for the record, because I assume nobody
cares about my opinion here.

--
Thanks,

David / dhildenb