Re: counting file descriptors with a cgroup controller
From: Krzysztof Opasiak
Date: Tue Mar 07 2017 - 07:35:06 EST
Hi
On 03/06/2017 07:58 PM, Tejun Heo wrote:
Hello,
On Fri, Feb 17, 2017 at 12:37:11PM +0100, Krzysztof Opasiak wrote:
We need to limit and monitor the number of file descriptors processes
keep open. If a process exceeds certain limit we'd like to terminate it
and restart it or reboot the whole system. Currently the RLIMIT API
allows limiting the number of file descriptors but to achieve our goals
we'd need to make sure all programmes we run handle EMFILE errno
properly. That is why we consider developing a cgroup controller that
limits the number of open file descriptors of its members (similar to
the memory controller).
Any comments? Is there any alternative that:
+ does not require modifications of user-land code,
+ enables other process (e.g. init) to be notified and apply policy.
Hmm... I'm not quite sure fds qualify as an independent system-wide
resource. We did that for pids because pids are globally limited and
can run out way earlier than memory backing it. I don't think we have
similar restrictions for fds, do we?
Well I'm not aware of such restrictions...
So maybe let me clarify our use case so we can have some more discussion
about this. We are dealing with the task of monitoring system services
on an IoT system. This system needs to run as long as possible without
a reboot, just like a server. In the server world almost the whole
system state is monitored by services like nagios, which measure each
parameter (CPU, memory etc.) at some interval. Unfortunately we cannot
use this approach in an embedded system due to power consumption.
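
To make the interval-based approach concrete, here is a minimal sketch of
polling a process's fd count through /proc/<pid>/fd. The function names,
the limit and the interval are only illustrative assumptions, not part of
any existing tool. Every round wakes the CPU, which is exactly the power
cost mentioned above.

```python
import os
import time


def count_open_fds(pid="self"):
    """Count open fds of a process by listing /proc/<pid>/fd (Linux only)."""
    return len(os.listdir(f"/proc/{pid}/fd"))


def monitor(pid, limit, interval_s, rounds):
    """Poll the fd count `rounds` times; return True once it exceeds `limit`.

    `limit`, `interval_s` and `rounds` are hypothetical policy knobs.
    """
    for _ in range(rounds):
        if count_open_fds(pid) > limit:
            return True
        time.sleep(interval_s)
    return False
```

A supervisor would call monitor() for each service's main pid and apply
its policy (restart, reboot) when it returns True.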
So generally now we consider two approaches:
1) Use rlimits when possible to limit resources for each process.
The problem here is that this creates an implicit requirement that all
system services are well written and able to detect that they have, for
example, run out of fds, and that they will then just exit with a
suitable error code instead of hanging forever and responding to clients
that they are unable to handle their request due to lack of fds. This is
hard, especially when a service uses a lot of libraries under the hood,
because they also need to return this error code from each function
that opens files. It is even harder when using proprietary services or
libraries for which we don't have access to the source code.
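
As an illustration of what each service has to cope with under approach 1,
the sketch below lowers the soft RLIMIT_NOFILE of the current process and
opens /dev/null until open() fails. The helper name and the limit value
are assumptions made up for this example. The error has to be caught and
propagated at every call site that opens a file, which is the burden
described above.

```python
import errno
import os
import resource


def open_until_emfile(soft_limit):
    """Lower the soft fd limit, open files until failure, restore the limit.

    Returns (errno of the failing open, number of fds opened before it).
    """
    old_soft, old_hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if old_hard != resource.RLIM_INFINITY:
        # The soft limit may not exceed the hard limit.
        soft_limit = min(soft_limit, old_hard)
    fds = []
    err = None
    resource.setrlimit(resource.RLIMIT_NOFILE, (soft_limit, old_hard))
    try:
        while True:
            fds.append(os.open(os.devnull, os.O_RDONLY))
    except OSError as exc:
        # A well-behaved service must handle this everywhere it opens files.
        err = exc.errno
    finally:
        for fd in fds:
            os.close(fd)
        resource.setrlimit(resource.RLIMIT_NOFILE, (old_soft, old_hard))
    return err, len(fds)
```

Calling open_until_emfile(64) is expected to fail with EMFILE once the
process holds 64 descriptors, including the ones it already had open.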
2) Use cgroups to limit and monitor resources usage
Generally systemd creates a cgroup for each service. Controllers like
the memory cgroup are able to notify userspace when memory usage reaches
some level. So, for example, systemd could get a notification that one
of the cgroups is using more memory than it should, but as long as this
is not the hard limit of the cgroup the service is not even going to
notice. So instead of the service getting an error from, for example,
malloc(), systemd could just send a signal to that service, ask it to
exit gracefully and then restart it. The disadvantage of this solution
is the need for a cgroup controller for each resource we would like to
monitor. For now we have suitable cgroups for everything we need apart
from file descriptors.
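
A rough sketch of the manager side of approach 2, assuming the cgroup v2
memory controller, whose memory.events file holds flat "key value"
counters such as "high" and "max". The cgroup path and the policy function
are hypothetical. Blocking until the file changes (instead of re-reading
it) is left out here, and the "high" boundary is only the closest analogue
to the soft notification level described above.

```python
import os


def parse_events(text):
    """Parse the flat 'key value' lines of a memory.events file into a dict."""
    events = {}
    for line in text.splitlines():
        key, _, value = line.partition(" ")
        if key and value:
            events[key] = int(value)
    return events


def read_events(cgroup_dir):
    """Read memory.events from a cgroup directory (the path is an assumption)."""
    with open(os.path.join(cgroup_dir, "memory.events")) as f:
        return parse_events(f.read())


def crossed_high(previous, current):
    """Hypothetical policy: True if the 'high' counter grew between two reads."""
    return current.get("high", 0) > previous.get("high", 0)
```

When crossed_high() returns True the manager could send SIGTERM to the
service and restart it, without the service ever seeing an allocation
failure.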
What do you think about this? Maybe you have some other ideas how we
could achieve this?
Best regards,
--
Krzysztof Opasiak
Samsung R&D Institute Poland
Samsung Electronics