Re: [PATCH v8] oom_kill.c: futex: Don't OOM reap the VMA containing the robust_list_head

From: Thomas Gleixner
Date: Fri Apr 08 2022 - 17:41:19 EST


On Fri, Apr 08 2022 at 12:13, Joel Savitz wrote:
>>     if (!fork()) {
>>         pri = mmap(NULL, 1<<20, PROT_READ | PROT_WRITE,
>>                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
>>         pthread_mutexattr_init(&mat_p);
>>         pthread_mutexattr_setpshared(&mat_p, PTHREAD_PROCESS_PRIVATE);
>>         pthread_mutexattr_setrobust(&mat_p, PTHREAD_MUTEX_ROBUST);
> One thing I don't understand is what kind of sane use case relies on
> robust futex for a process-private lock?
> Is there a purpose to a lock being on the robust list if there are no
> other processes that must be woken in case the holder process is
> killed?

Ever heard about the concept of multi threading?

> If this usage serves no purpose besides causing races during oom, we
> should discourage this use, perhaps by adding a note on the manpage.

This usage does not cause races during oom. It would not cause races
even if it were silly, which it is not, except in the demonstrator
above. The keyword here is *demonstrator*.

The oom killer itself causes the race because it starts reaping the VMAs
without granting the target time to terminate. This needs to be fixed in
the first place, period.

If the target can't terminate because it is stuck then yes, there will
be fallout where a robust futex cannot be released, but that's something
which cannot be solved at all.

I'm really tired of this by now. Several people explained at great
length the shortcomings of your 'cure the symptom' approach, showed you
that the "impossible to reproduce" problem is real and told you very
explicitly what the proper solution is.

So instead of sitting down and really tackling the root cause, all you
can come up with is to post the same 'cure the symptom' muck over and
over and then, once it is debunked, grasp at straws.

Coming back to your original question.

What's the difference between a process shared and a process private
futex in the context of a multi threaded process?

  - The process shared futex must obviously live in a shared mapping

  - The process private futex has no need for a shared mapping because
    all threads share the same address space.

What do they have in common?

  - From the kernel's point of view, all of them are threads

  - All of them care about the unexpected exit/death of some other
    thread with respect to locking
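
To make the second point concrete, here is a minimal sketch of a process
private robust mutex inside a single multi threaded process. It is not
taken from the patch or from the demonstrator quoted above. The names
lock_after_owner_died() and die_holding_lock() are made up for
illustration, Linux with glibc is assumed, and error checking is left
out for brevity. One thread takes the lock and terminates without
releasing it; the next locker is told about it via EOWNERDEAD instead of
blocking forever.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <pthread.h>

/* Process private robust mutex; lives in ordinary (non-shared) memory. */
static pthread_mutex_t lock;

/* Hypothetical worker: takes the lock and terminates while holding it. */
static void *die_holding_lock(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    return NULL;                /* thread exits, lock is never released */
}

/*
 * Hypothetical helper: returns the result of the first lock attempt
 * made after the owning thread has died.
 */
int lock_after_owner_died(void)
{
    pthread_mutexattr_t attr;
    pthread_t t;
    int ret;

    pthread_mutexattr_init(&attr);
    pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_PRIVATE);
    pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
    pthread_mutex_init(&lock, &attr);
    pthread_mutexattr_destroy(&attr);

    pthread_create(&t, NULL, die_holding_lock, NULL);
    pthread_join(t, NULL);      /* the owner is gone at this point */

    /* A non-robust mutex would block here forever. */
    ret = pthread_mutex_lock(&lock);
    if (ret == EOWNERDEAD)
        pthread_mutex_consistent(&lock);    /* mark the state as repaired */
    if (ret == 0 || ret == EOWNERDEAD)
        pthread_mutex_unlock(&lock);

    pthread_mutex_destroy(&lock);
    return ret;
}
```

Built with -pthread and called from main(), the helper is expected to
return EOWNERDEAD. No second process and no shared mapping is involved;
the surviving thread depends on the robust list exactly as a peer
process would for a process shared mutex.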

So why would a process private robust mutex be any different from a
process shared one?

I'm sure you can answer that question yourself by now.

Thanks,

tglx