On Wed 23-01-19 12:24:38, Yang Shi wrote:
> On 1/23/19 1:59 AM, Michal Hocko wrote:
> > On Wed 23-01-19 04:09:42, Yang Shi wrote:
> > > In current implementation, both kswapd and direct reclaim have to iterate
> > > all mem cgroups. This was not a problem before offline mem cgroups could
> > > be iterated. But, currently with iterating offline mem cgroups, it
> > > could be very time consuming. In our workloads, we saw over 400K mem
> > > cgroups accumulated in some cases, only a few hundred are online memcgs.
> > > Although kswapd could help out to reduce the number of memcgs, direct
> > > reclaim still gets hit with iterating a number of offline memcgs in some
> > > cases. We experienced the responsiveness problems due to this
> > > occasionally.
> >
> > Can you provide some numbers?
>
> What numbers do you mean? How long did it take to iterate all the memcgs?
> For now I don't have the exact number for the production environment, but
> the unresponsiveness is visible.

Yeah, I would be interested in the worst case direct reclaim latencies.
You can get that from our vmscan tracepoints quite easily.
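
As a rough illustration of pulling those latencies out of a trace, the sketch below pairs mm_vmscan_direct_reclaim_begin and mm_vmscan_direct_reclaim_end events per pid and reports the longest interval. It assumes each record sits on a single line in the form "<comm> <pid> [<cpu>] <timestamp>: <event>" and that the command name contains no spaces; the function names are made up for this example. The sample records are the two events from the test run quoted in this thread, joined onto one line each.

```python
import re

# Assumed single-line record layout: "<comm> <pid> [<cpu>] <timestamp>: <event>"
RECORD = re.compile(
    r"^\s*(?P<comm>\S+)\s+(?P<pid>\d+)\s+\[\d+\]\s+(?P<ts>\d+\.\d+):\s+"
    r"vmscan:mm_vmscan_direct_reclaim_(?P<kind>begin|end)\b"
)


def direct_reclaim_latencies(lines):
    """Return a list of (pid, seconds), one entry per begin/end pair."""
    started = {}      # pid -> timestamp of the pending begin event
    latencies = []
    for line in lines:
        m = RECORD.match(line)
        if not m:
            continue  # unrelated event or a line in another layout
        pid, ts = int(m["pid"]), float(m["ts"])
        if m["kind"] == "begin":
            started[pid] = ts
        elif pid in started:
            latencies.append((pid, ts - started.pop(pid)))
    return latencies


def worst_case(lines):
    """Return the (pid, seconds) pair with the longest latency, or None."""
    latencies = direct_reclaim_latencies(lines)
    return max(latencies, key=lambda item: item[1]) if latencies else None


sample = [
    "dd 13873 [011]   578.542919: vmscan:mm_vmscan_direct_reclaim_begin",
    "dd 13873 [011]   578.758689: vmscan:mm_vmscan_direct_reclaim_end",
]

if __name__ == "__main__":
    print(worst_case(sample))
```

Feeding it a longer capture from a production machine would give the worst case directly, provided the capture uses the same one-record-per-line layout.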

> I had some test numbers from triggering direct reclaim with 8k memcgs
> artificially, each of which has just one clean page charged, so the
> reclaim is cheaper than in a real production environment.
> perf shows it took around 220ms to iterate 8k memcgs:
>
>     dd 13873 [011]   578.542919: vmscan:mm_vmscan_direct_reclaim_begin
>     dd 13873 [011]   578.758689: vmscan:mm_vmscan_direct_reclaim_end
>
> So, iterating 400K would take at least 11s in this artificial case. The
> production environment is much more complicated, so it would take much
> longer in fact.

Having real world numbers would definitely help with the justification.
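
The arithmetic behind that estimate can be reproduced with a few lines. This is only a sketch: it assumes the walk time scales linearly with the number of memcgs, as the estimate in the thread does, and it reads "8k" and "400K" as 8,000 and 400,000 (only their ratio of 50 matters). The function name is invented for the example.

```python
from decimal import Decimal

# Timestamps (in seconds) of the two tracepoints from the test run.
begin = Decimal("578.542919")   # mm_vmscan_direct_reclaim_begin
end = Decimal("578.758689")     # mm_vmscan_direct_reclaim_end


def extrapolate(measured_s, measured_memcgs, target_memcgs):
    """Scale a measured walk time linearly with the number of memcgs."""
    return measured_s * Decimal(target_memcgs) / Decimal(measured_memcgs)


measured = end - begin                            # the "around 220ms" walk
estimate = extrapolate(measured, 8_000, 400_000)  # assumed linear scaling

if __name__ == "__main__":
    print(f"measured walk: {measured} s")
    print(f"estimate for 400K memcgs: {estimate} s")
```

The measured interval is 0.21577 s, so the linear estimate comes to roughly 10.8 s; rounding the measurement up to 220ms first gives the 11s quoted above.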

> > > Here just break the iteration once it reclaims enough pages as
> > > memcg direct reclaim does. This may hurt the fairness among memcgs
> > > since direct reclaim may always do reclaim from the same memcgs. But, it
> > > sounds ok since direct reclaim just tries to reclaim SWAP_CLUSTER_MAX
> > > pages and memcgs can be protected by min/low.
> >
> > OK, this makes some sense to me. The purpose of the direct reclaim is
> > to reclaim some memory and throttle the allocation pace. The iterator is
> > cached so the next reclaimer on the same hierarchy will simply continue
> > so the fairness should be more or less achieved.
>
> Yes, you are right. I missed this point.
>
> > Btw. is there any reason to keep !global_reclaim() check in place? Why
> > is it not sufficient to exclude kswapd?
>
> Iterating all memcgs in kswapd is still useful to help to reduce those
> zombie memcgs.

Yes, but for that you do not need to check for global_reclaim right?
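
To make the behaviour under discussion concrete, here is a toy model, not kernel code. It assumes a hierarchy that remembers where the last walk stopped (standing in for the cached iterator), a direct reclaimer that breaks out once it has SWAP_CLUSTER_MAX pages, and a kswapd pass that always walks every memcg. The class, its fields and the "take every page of a visited memcg" rule are simplifications invented for illustration.

```python
SWAP_CLUSTER_MAX = 32  # pages a direct reclaim pass tries to reclaim


class Hierarchy:
    """Toy memcg hierarchy with a cached (shared) iterator position."""

    def __init__(self, reclaimable_pages):
        self.pages = list(reclaimable_pages)  # reclaimable pages per memcg
        self.position = 0                     # where the next walk resumes

    def reclaim(self, is_kswapd, nr_to_reclaim=SWAP_CLUSTER_MAX):
        """Walk memcgs from the cached position; return (reclaimed, visited)."""
        reclaimed = 0
        visited = []
        for _ in range(len(self.pages)):
            idx = self.position
            self.position = (self.position + 1) % len(self.pages)
            visited.append(idx)
            reclaimed += self.pages[idx]      # simplification: take it all
            self.pages[idx] = 0
            # Direct reclaim stops once it has enough; kswapd keeps walking
            # so that offline (zombie) memcgs still get visited.
            if not is_kswapd and reclaimed >= nr_to_reclaim:
                break
        return reclaimed, visited


if __name__ == "__main__":
    h = Hierarchy([16] * 8)
    print(h.reclaim(is_kswapd=False))  # first direct reclaimer
    print(h.reclaim(is_kswapd=False))  # next one resumes after the first
    print(h.reclaim(is_kswapd=True))   # kswapd walks the whole hierarchy
```

In this model two back-to-back direct reclaimers visit different memcgs because the second resumes where the first stopped, which is the fairness argument made above, while the early exit depends only on whether the caller is kswapd.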