Re: [RFC 2/2] signal: extend pidfd_send_signal() to allow expedited process killing

From: Roman Gushchin
Date: Thu Apr 11 2019 - 17:45:54 EST


On Thu, Apr 11, 2019 at 10:09:06AM -0700, Suren Baghdasaryan wrote:
> On Thu, Apr 11, 2019 at 8:33 AM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
> >
> > On Wed, Apr 10, 2019 at 06:43:53PM -0700, Suren Baghdasaryan wrote:
> > > Add new SS_EXPEDITE flag to be used when sending SIGKILL via
> > > pidfd_send_signal() syscall to allow expedited memory reclaim of the
> > > victim process. The usage of this flag is currently limited to SIGKILL
> > > signal and only to privileged users.
> >
> > What is the downside of doing expedited memory reclaim? ie why not do it
> > every time a process is going to die?

Hello, Suren!

I also like the idea of always reaping.

> I think with an implementation that does not use/abuse oom-reaper
> thread this could be done for any kill. As I mentioned oom-reaper is a
> limited resource which has access to memory reserves and should not be
> abused in the way I do in this reference implementation.

In most OOM cases it doesn't matter that much which task gets reaped,
so I don't think that reusing the oom-reaper thread is bad.
It should be relatively easy to tweak it so that it won't
wait for mmap_sem if there are other tasks waiting to be reaped.
Also, the oom code could add to the head of the list, and the expedited
killing to the tail, or something like that.
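
To make the head/tail ordering concrete, here is a rough user-space
sketch. It is not kernel code and not the actual oom_reaper
implementation: the struct and function names are made up for
illustration, and it models only the queue ordering, not the mmap_sem
handling.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a reap queue; none of these names are kernel APIs. */
struct reap_request {
	int pid;
	struct reap_request *prev, *next;
};

struct reap_queue {
	struct reap_request *head, *tail;
};

/* OOM victims are added to the head, so they are reaped first. */
void queue_oom_victim(struct reap_queue *q, struct reap_request *r)
{
	assert(q != NULL && r != NULL);
	r->prev = NULL;
	r->next = q->head;
	if (q->head)
		q->head->prev = r;
	else
		q->tail = r;
	q->head = r;
}

/* Expedited kills are added to the tail, behind any OOM victims. */
void queue_expedited_kill(struct reap_queue *q, struct reap_request *r)
{
	assert(q != NULL && r != NULL);
	r->next = NULL;
	r->prev = q->tail;
	if (q->tail)
		q->tail->next = r;
	else
		q->head = r;
	q->tail = r;
}

/* The reaper always takes the request at the head; NULL if the queue is empty. */
struct reap_request *next_to_reap(struct reap_queue *q)
{
	struct reap_request *r;

	assert(q != NULL);
	r = q->head;
	if (!r)
		return NULL;
	q->head = r->next;
	if (q->head)
		q->head->prev = NULL;
	else
		q->tail = NULL;
	r->prev = r->next = NULL;
	return r;
}
```

With this ordering, an OOM victim queued after several expedited kills
is still reaped before them, while the expedited kills keep their FIFO
order among themselves.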

The only thing is, if we're going to reap all tasks, we probably
want to have a per-node oom_reaper thread.

Thanks!