Re: [PATCH 3/3 staging-next] mm: Remove RCU and tasklocks from lmk
From: Michal Hocko
Date: Fri Feb 10 2017 - 02:57:44 EST
On Fri 10-02-17 08:39:11, peter enderborg wrote:
> On 02/09/2017 09:05 PM, Michal Hocko wrote:
> > On Thu 09-02-17 14:21:52, peter enderborg wrote:
> >> Fundamental changes:
> >> 1 Does NOT take any RCU lock in shrinker functions.
> >> 2 It returns same result for scan and counts, so we dont need to do
> >> shinker will know when it is pointless to call scan.
> >> 3 It does not lock any other process than the one that is
> >> going to be killed.
> >> Background.
> >> The low memory killer scans for process that can be killed to free
> >> memory. This can be cpu consuming when there is a high demand for
> >> memory. This can be seen by analysing the kswapd0 task work.
> >> The stats function added in earler patch adds a counter for waste work.
> >> How it works.
> >> This patch create a structure within the lowmemory killer that caches
> >> the user spaces processes that it might kill. It is done with a
> >> sorted rbtree so we can very easy find the candidate to be killed,
> >> and knows its properies as memory usage and sorted by oom_score_adj
> >> to look up the task with highest oom_score_adj. To be able to achive
> >> this it uses oom_score_notify events.
> >> This patch also as a other effect, we are now free to do other
> >> lowmemorykiller configurations. Without the patch there is a need
> >> for a tradeoff between freed memory and task and rcu locks. This
> >> is no longer a concern for tuning lmk. This patch is not intended
> >> to do any calculation changes other than we do use the cache for
> >> calculate the count values and that makes kswapd0 to shrink other
> >> areas.
> > I have to admit I really do not understand big part of the above
> > paragraph as well as how this all is supposed to work. A quick glance
> > over the implementation. __lmk_task_insert seems to be only called from
> > the oom_score notifier context. If nobody updates the value then no task
> > will get into the tree. Or am I missing something really obvious here?
> > Moreover oom scores tend to be mostly same for tasks. That means that
> > your sorted tree will become sorted by pids in most cases. I do not see
> > any sorting based on the rss nor any updates that would reflect updates
> > of rss. How can this possibly work?
> The task tree nodes are created,updated or removed from the notifier when
> there is a relevant oom_score_adj change. If no one create a task that
> is in the range for the lowmemorykiller the tree will be empty. This is
> an android feature so the score will be updated very often. It is
> part of activity manager to prioritise tasks. Why should we do sort of
Because the current lmk selects the tasks based on rss. And the patch
doesn't explain why this is no longer suitable and a different metric
shoult be used. If you also consider that the scale of oom_score_adj is
quite small, conllisions when you simply sort based on pids which is
more than questionable. I really fail to see how this can work
reasonably and why the change of the lmk semantic is even acceptable.