Re: [RFC v2] prctl: prctl(PR_SET_IDLE, PR_IDLE_MODE_KILLME), for stateless idle loops

From: Michal Hocko
Date: Tue Nov 21 2017 - 02:05:19 EST


On Mon 20-11-17 20:48:10, Shawn Landden wrote:
> On Mon, Nov 20, 2017 at 12:35 AM, Michal Hocko <mhocko@xxxxxxxxxx> wrote:
> > On Fri 17-11-17 20:45:03, Shawn Landden wrote:
> >> On Fri, Nov 3, 2017 at 2:09 AM, Michal Hocko <mhocko@xxxxxxxxxx> wrote:
> >>
> >> > On Thu 02-11-17 23:35:44, Shawn Landden wrote:
> >> > > It is common for services to be stateless around their main event loop.
> >> > > If a process sets PR_SET_IDLE to PR_IDLE_MODE_KILLME then it
> >> > > signals to the kernel that epoll_wait() and friends may not complete,
> >> > > and the kernel may send SIGKILL if resources get tight.
> >> > >
> >> > > See my systemd patch: https://github.com/shawnl/systemd/tree/prctl
> >> > >
> >> > > Android uses this memory model for all programs, and having it in the
> >> > > kernel will enable integration with the page cache (not in this
> >> > > series).
> >> > >
> >> > > 16 bytes per process is kinda spendy, but I want to keep
> >> > > lru behavior, which mem_score_adj does not allow. When a supervisor,
> >> > > like Android's user input is keeping track this can be done in
> >> > user-space.
> >> > > It could be pulled out of task_struct if an cross-indexing additional
> >> > > red-black tree is added to support pid-based lookup.
> >> >
> >> > This is still an abuse and the patch is wrong. We really do have an API
> >> > to use I fail to see why you do not use it.
> >> >
> >> When I looked at wait_queue_head_t it was 20 byes.
> >
> > I do not understand. What I meant to say is that we do have a proper
> > user api to hint OOM killer decisions.
> This is a FIFO queue, rather than a heuristic, which is all you get
> with the current API.

Yes I can read the code. All I am saing is that we already have an API
to achieve what you want or at least very similar.

Let me be explicit.
Nacked-by: Michal Hocko <mhocko@xxxxxxxx>
until it is sufficiently explained that the oom_score_adj is not
suitable and there are no other means to achieve what you need.
--
Michal Hocko
SUSE Labs