On Thu 03-01-19 10:40:54, Yang Shi wrote:
> On 1/3/19 10:13 AM, Michal Hocko wrote:
> > On Thu 03-01-19 09:33:14, Yang Shi wrote:
> > > On 1/3/19 2:12 AM, Michal Hocko wrote:
> > > > On Thu 03-01-19 04:05:30, Yang Shi wrote:
> > > > > Currently, force empty reclaims memory synchronously when writing to
> > > > > memory.force_empty. It may take some time to return and the afterwards
> > > > > operations are blocked by it. Although it can be interrupted by signal,
> > > > > it still seems suboptimal.
> > > >
> > > > Why is it suboptimal? We are doing that operation on behalf of the
> > > > process requesting it. Why should anybody else pay for it? In other
> > > > words why should we hide the overhead?
> > >
> > > Please see the below explanation.
> > >
> > > > > Now css offline is handled by worker, and the typical usecase of force
> > > > > empty is before memcg offline. So, handling force empty in css offline
> > > > > sounds reasonable.
> > > >
> > > > Hmm, so I guess you are talking about
> > > > echo 1 > $MEMCG/force_empty
> > > > rmdir $MEMCG
> > > >
> > > > and you are complaining that the operation takes too long. Right? Why do
> > > > you care actually?
> > >
> > > We have some usecases which create and remove memcgs very frequently, and
> > > the tasks in the memcg may just access the files which are unlikely accessed
> > > by anyone else. So, we prefer force_empty the memcg before rmdir'ing it to
> > > reclaim the page cache so that they don't get accumulated to incur
> > > unnecessary memory pressure. Since the memory pressure may incur direct
> > > reclaim to harm some latency sensitive applications.
> >
> > Yes, this makes sense to me.
> >
> > > And, the create/remove might be run in a script sequentially (there might be
> > > a lot of scripts or applications running in parallel to do this), i.e.
> > > mkdir cg1
> > > do something
> > > echo 0 > cg1/memory.force_empty
> > > rmdir cg1
> > >
> > > mkdir cg2
> > > ...
> > >
> > > The creation of the afterwards memcg might be blocked by the force_empty for
> > > long time if there is a lot of page cache, so the overall throughput of the
> > > system may get hurt.
> >
> > Is there any reason for your scripts to be strictly sequential here? In
> > other words why cannot you offload those expensive operations to a
> > detached context in _userspace_?
>
> I would say it does not have to be strictly sequential. The above script is just
> an example to illustrate the pattern. But, sometimes it may hit such pattern
> due to the complicated cluster scheduling and container scheduling in the
> production environment, for example the creation process might be scheduled
> to the same CPU which is doing force_empty. I have to say I don't know too
> much about the internals of the container scheduling.

In that case I do not see a strong reason to implement the offloading
into the kernel. It is additional code and semantics to maintain.

I think it is more important to discuss whether we want to introduce
force_empty in cgroup v2.
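
For illustration only, here is a minimal sketch of what offloading the expensive operations to a detached context in userspace could look like for the mkdir/force_empty/rmdir pattern quoted above. The helper name teardown_memcg_async is hypothetical and does not come from the thread. The sketch assumes a cgroup v1 memory hierarchy where memory.force_empty exists and the caller may write to it.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): reclaim and remove a memcg
# in a background subshell so the caller is not blocked by the reclaim.
# Assumes "$1" is a cgroup v1 memcg directory with no tasks left in it.
teardown_memcg_async() {
    cg="$1"
    (
        # The synchronous reclaim blocks only this background subshell.
        echo 0 > "$cg/memory.force_empty" &&
        rmdir "$cg"
    ) &
}

# Usage sketch, mirroring the quoted pattern:
#   mkdir cg1
#   ... do something ...
#   teardown_memcg_async cg1   # returns immediately
#   mkdir cg2                  # no longer waits for cg1's reclaim
#   wait                       # optionally reap background teardowns at the end
```

The reclaim cost is still paid by the context that requested it, only in the background, which seems to be the point of keeping the offloading out of the kernel.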