Re: [RFC]: userspace memory reaping

From: Minchan Kim
Date: Thu Nov 05 2020 - 12:41:48 EST


On Thu, Nov 05, 2020 at 09:21:13AM -0800, Suren Baghdasaryan wrote:
> On Thu, Nov 5, 2020 at 9:16 AM Michal Hocko <mhocko@xxxxxxxx> wrote:
> >
> > On Thu 05-11-20 08:50:58, Suren Baghdasaryan wrote:
> > > On Thu, Nov 5, 2020 at 4:20 AM Michal Hocko <mhocko@xxxxxxxx> wrote:
> > > >
> > > > On Wed 04-11-20 12:40:51, Minchan Kim wrote:
> > > > > On Wed, Nov 04, 2020 at 07:58:44AM +0100, Michal Hocko wrote:
> > > > > > On Tue 03-11-20 13:32:28, Minchan Kim wrote:
> > > > > > > On Tue, Nov 03, 2020 at 10:35:50AM +0100, Michal Hocko wrote:
> > > > > > > > On Mon 02-11-20 12:29:24, Suren Baghdasaryan wrote:
> > > > > > > > [...]
> > > > > > > > > To follow up on this. Should I post an RFC implementing SIGKILL_SYNC
> > > > > > > > > which in addition to sending a kill signal would also reap the
> > > > > > > > > victim's mm in the context of the caller? Maybe having some code will
> > > > > > > > > get the discussion moving forward?
> > > > > > > >
> > > > > > > > Yeah, having some code, even preliminary, might help here. This definitely
> > > > > > > > needs a go-ahead from the process management people, as that area is a land
> > > > > > > > full of surprises...
> > > > > > >
> > > > > > > Just a reminder of an idea I suggested, which reuses an existing concept:
> > > > > > >
> > > > > > > fd = pidfd_open(victim process)
> > > > > > > fdatasync(fd);
> > > > > > > close(fd);
> > > > > >
> > > > > > I must have missed this proposal. Anyway, are you suggesting fdatasync
> > > > > > to act as a destructive operation?
> > > > >
> > > > > write(fd) && fdatasync(fd) are already destructive operations if the file
> > > > > is shared.
> > > >
> > > > I am likely missing something because fdatasync will not destroy any
> > > > underlying data. It will sync
> > > >
> > > > > You don't need to see reaping as a destructive operation. Rather, just
> > > > > think of it as a commit of an asynchronous state: "write file into page cache
> > > > > and commit with fsync" and "kill process and commit with fsync".
> > > >
> > > > I am sorry but I do not follow. The result of the memory reaping is a
> > > > data loss. Any private mapping will simply lose its content. The caller
> > > > will get EFAULT when trying to access it but there is no way to
> > > > reconstruct the data. This is anything but what I see
> > > > f{data}sync being used for.
> > >
> > > I think Minchan considers f{data}sync as a "commit" operation.
> >
> > But there is nothing like commit in that operation. It is simply a
> > destroy operation. ftruncate as Minchan mentions in another reply would
> > be a closer fit but how do you interpret the length argument? What about
> > memory regions which cannot be reaped?
> >
> > I do understand that reusing an existing mechanism is usually preferable
> > but the semantics should be reasonable and easy to reason about.
>
> Maybe then we can consider a flag for pidfd_send_signal() to indicate
> that we want a synchronous mm cleanup when SIGKILL is being sent?
> Similar to my original RFC but cleanup would happen in the context of
> the caller. That seems to me like the simplest and most obvious way of
> expressing what we want to accomplish. WDYT?

I think that's better than introducing a specific synchronous kill.
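
For reference, a rough userspace sketch of what the caller side could look
like. This only uses the existing pidfd_open() and pidfd_send_signal()
syscalls. The flag name PIDFD_SIGKILL_SYNC_REAP_HYPOTHETICAL and the helper
kill_via_pidfd() are made up for illustration; no such flag exists today and
the kernel currently rejects non-zero flags, so the placeholder is defined
as 0. The fallback syscall numbers are the common ones and are only an
assumption for headers that lack them.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fallbacks for older headers; these are the numbers used on most arches. */
#ifndef SYS_pidfd_open
#define SYS_pidfd_open 434
#endif
#ifndef SYS_pidfd_send_signal
#define SYS_pidfd_send_signal 424
#endif

/*
 * HYPOTHETICAL: stands in for the "reap the mm synchronously" flag discussed
 * in this thread. It does not exist in any kernel, so it is defined as 0.
 */
#define PIDFD_SIGKILL_SYNC_REAP_HYPOTHETICAL 0u

/*
 * Open a pidfd for the victim and send SIGKILL through it.
 * Returns 0 on success, -1 on failure with errno set.
 */
static int kill_via_pidfd(pid_t victim, unsigned int flags)
{
    int pidfd = (int)syscall(SYS_pidfd_open, victim, 0);
    if (pidfd < 0)
        return -1;

    /* A NULL siginfo makes this behave like kill(2) for the target process. */
    long ret = syscall(SYS_pidfd_send_signal, pidfd, SIGKILL, NULL, flags);

    int saved_errno = errno;   /* close() may clobber errno */
    close(pidfd);
    errno = saved_errno;

    return ret < 0 ? -1 : 0;
}
```

A caller such as a userspace low memory killer would then do
kill_via_pidfd(pid, PIDFD_SIGKILL_SYNC_REAP_HYPOTHETICAL). With a real flag,
the call would return only after the victim's memory has been reaped in the
caller's context, which is the part that still needs to be designed.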