Re: threads-max observe limits

From: Michal Hocko
Date: Tue Sep 17 2019 - 11:53:55 EST


On Tue 17-09-19 17:28:02, Heinrich Schuchardt wrote:
>
> On 9/17/19 12:03 PM, Michal Hocko wrote:
> > Hi,
> > I have just stumbled over 16db3d3f1170 ("kernel/sysctl.c: threads-max
> > observe limits") and I am really wondering what is the motivation behind
> > the patch. We've had a customer noticing the threads_max autoscaling
> > differences between 3.12 and 4.4 kernels and wanted to override the auto
> > tuning from userspace, just to find out that this is not possible.
>
> set_max_threads() sets the upper limit (max_threads_suggested) for
> threads such that at most 1/8th of the total memory can be occupied
> by the threads' administrative data (of size THREAD_SIZE each). On my
> 32 GiB system this results in 254313 threads.

This is quite arbitrary, isn't it? What would happen if the limit was
twice as large?
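
For reference, the arithmetic behind that figure can be sketched in user
space as below. This is only an illustration of the 1/8th rule described
above, not the kernel code itself. The function name and the SKETCH_*
constants are made up for this example; the values assume x86_64 with a
16 KiB THREAD_SIZE, a floor of 20 threads and a ceiling of 0x3fffffff.

```c
#include <stdint.h>

/* Assumed values for x86_64; the real ones live in kernel headers. */
#define SKETCH_THREAD_SIZE (16ULL * 1024)  /* per-thread kernel stack */
#define SKETCH_MIN_THREADS 20ULL           /* assumed lower bound */
#define SKETCH_MAX_THREADS 0x3fffffffULL   /* assumed upper bound */

/*
 * Hypothetical helper: number of threads whose per-thread data would
 * occupy at most 1/8th of total_ram_bytes, clamped to the bounds above.
 */
static uint64_t sketch_max_threads(uint64_t total_ram_bytes)
{
	uint64_t threads = total_ram_bytes / (SKETCH_THREAD_SIZE * 8);

	if (threads < SKETCH_MIN_THREADS)
		threads = SKETCH_MIN_THREADS;
	if (threads > SKETCH_MAX_THREADS)
		threads = SKETCH_MAX_THREADS;
	return threads;
}
```

With exactly 32 GiB this gives 262144. The 254313 quoted above is
somewhat lower, presumably because the kernel counts usable pages rather
than installed RAM. Doubling the memory, or the fraction, simply doubles
the result until the ceiling is hit.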

> With patch 16db3d3f1170 ("kernel/sysctl.c: threads-max observe limits")
> a user cannot set an arbitrarily high number for
> /proc/sys/kernel/threads-max which could lead to a system stalling
> because the thread headers occupy all the memory.

This is still the admin's decision to make. You can consume the
memory by other means, and that is why we have measures in place,
e.g. memcg accounting.

> When developing the patch I remarked that on a system where memory is
> installed dynamically it might be a good idea to recalculate this limit.
> If you have a system that boots with, let's say, 8 GiB and then
> dynamically installs a few TiB of RAM this might make sense. But such a
> dynamic update of max_threads_suggested was left out for the sake of
> simplicity.
>
> Anyway if more than 100,000 threads are used on a system, I would wonder
> if the software should not be changed to use thread-pools instead.

You do not change the software to overcome artificial bounds based on
guessing.

So can we get back to the justification of the patch? What kind of
real-life problem does it solve, and why is it OK to override an admin
decision?
If there is no strong justification then the patch should be reverted
because from what I have heard it has been noticed and it has broken
a certain deployment. I am not really clear about technical details yet
but it seems that there are workloads that believe they need to touch
this tuning and complain if that is not possible.
--
Michal Hocko
SUSE Labs