Re: Cgroups "pids" controller does not update "pids.current" count immediately
From: Ivan Zahariev
Date: Fri Jun 15 2018 - 13:40:11 EST
On 15.6.2018 at 19:16, Tejun Heo wrote:
> On Fri, Jun 15, 2018 at 07:07:27PM +0300, Ivan Zahariev wrote:
>> I understand all concerns and design decisions. However, having
>> RLIMIT_NPROC support combined with "cgroups" hierarchy would be very
>> Does it make sense that you introduce "nproc.current" and
>> "nproc.max" metrics which work in the same atomic, real-time way
>> like RLIMIT_NPROC? Or make this in a new "nproc" controller?
> I'm skeptical for two reasons.
> 1. That doesn't sound much like a resource control problem but more of
>    a policy enforcement problem.
> 2. and it's difficult to see why such policies would need to be that
>    strict. Where is the requirement coming from?
The lazy pids accounting combined with modern fast CPUs makes the
"pids.current" metric practically unusable for resource limiting in our
case. In a test where we started and ended one single process very
quickly, we saw "pids.current" reach values of up to 185 (while the
correct value at all times is either 0 or 1). If we want a "cgroup" to
be able to spawn at most 50 processes, we have to set "pids.max" to some
high value like 300 in order to compensate for the pids uncharge lag
(and this depends on the speed of the CPU and how busy the system is).
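The measurement described above can be approximated with a short script. This is only a sketch and not necessarily how the test above was run: the helper name peak_pids_current is made up for illustration, and it assumes that the "pids" controller is mounted, that the given cgroup directory already exists, and that the caller is allowed to write to its "cgroup.procs" file.

```python
import os
import subprocess
from pathlib import Path


def peak_pids_current(cgroup_dir, spawns=1000, cmd=("true",)):
    """Spawn `cmd` inside the cgroup `spawns` times, one at a time, and
    return the highest "pids.current" value read after each child exits."""
    cgroup = Path(cgroup_dir)
    procs_file = cgroup / "cgroup.procs"
    current_file = cgroup / "pids.current"

    def enter_cgroup():
        # Runs in the forked child before exec: move the child into the cgroup.
        procs_file.write_text(str(os.getpid()))

    peak = 0
    for _ in range(spawns):
        # run() waits for the child, so at most one process is alive at a time.
        subprocess.run(list(cmd), preexec_fn=enter_cgroup, check=True)
        # Sample the counter right after the child has been reaped.
        peak = max(peak, int(current_file.read_text()))
    return peak
```

It could be called as peak_pids_current("/sys/fs/cgroup/pids/test"), where the path is a hypothetical cgroup v1 location. If the counter were uncharged immediately, the returned peak would never exceed 1.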
Our use-case is for a shared web hosting service. Our customers start a
CGI process for each PHP web request and therefore process start/end
happens at a very high rate. We don't want customers to be able to
launch too many CGI processes (NPROC limit) because this exhausts the
web & database servers, and probably strains Linux kernel resources
(like total "opened files" per user). Furthermore, some users are
malicious and launch fork-bombs and other resource-exhaustion attacks.
You may be right that we enforce a policy rather than resource control.
This has worked for us for 15+ years now. The motivation is that a
global RLIMIT_NPROC easily lets us limit all system and Linux kernel
resources "per customer" ("cgroups" allows us to limit only certain
system resources). Additionally, not all user-space daemons allow for a
granular "per user" limit or proper grouping (for example, MySQL has
only users, and no "per customer" groups support). Now we want to have
different "cgroups" hierarchies for a customer (SSH, CGI, Crond), each
with their own RLIMIT_NPROC, and a total RLIMIT_NPROC for the parent
"per customer" cgroup.
Excuse me for the lengthy post :-)