Re: [RFC]: userspace memory reaping

From: Michal Hocko
Date: Thu Nov 05 2020 - 12:44:04 EST


On Thu 05-11-20 09:21:13, Suren Baghdasaryan wrote:
> On Thu, Nov 5, 2020 at 9:16 AM Michal Hocko <mhocko@xxxxxxxx> wrote:
> >
> > On Thu 05-11-20 08:50:58, Suren Baghdasaryan wrote:
> > > On Thu, Nov 5, 2020 at 4:20 AM Michal Hocko <mhocko@xxxxxxxx> wrote:
> > > >
> > > > On Wed 04-11-20 12:40:51, Minchan Kim wrote:
> > > > > On Wed, Nov 04, 2020 at 07:58:44AM +0100, Michal Hocko wrote:
> > > > > > On Tue 03-11-20 13:32:28, Minchan Kim wrote:
> > > > > > > On Tue, Nov 03, 2020 at 10:35:50AM +0100, Michal Hocko wrote:
> > > > > > > > On Mon 02-11-20 12:29:24, Suren Baghdasaryan wrote:
> > > > > > > > [...]
> > > > > > > > > To follow up on this. Should I post an RFC implementing SIGKILL_SYNC
> > > > > > > > > which in addition to sending a kill signal would also reap the
> > > > > > > > > victim's mm in the context of the caller? Maybe having some code will
> > > > > > > > > get the discussion moving forward?
> > > > > > > >
> > > > > > > > Yeah, having some code, even preliminary, might help here. This definitely
> > > > > > > > needs a go-ahead from process management people as that area is a land
> > > > > > > > full of surprises...
> > > > > > >
> > > > > > > Just a reminder of an idea I suggested, reusing an existing concept:
> > > > > > >
> > > > > > > fd = pidfd_open(victim process)
> > > > > > > fdatasync(fd);
> > > > > > > close(fd);
> > > > > >
> > > > > > I must have missed this proposal. Anyway, are you suggesting fdatasync
> > > > > > to act as a destructive operation?
> > > > >
> > > > > write(fd) && fdatasync(fd) are already a destructive operation if the file
> > > > > is shared.
> > > >
> > > > I am likely missing something because fdatasync will not destroy any
> > > > underlying data. It will sync
> > > >
> > > > > You don't need to view reaping as a destructive operation. Rather, just
> > > > > commit the asynchronous state: "write file into page cache and commit
> > > > > with fsync" and "killing process and commit with fsync".
> > > >
> > > > I am sorry but I do not follow. The result of the memory reaping is a
> > > > data loss. Any private mapping will simply lose its content. The caller
> > > > will get EFAULT when trying to access it but there is no way to
> > > > reconstruct the data. This is anything but what I see f{data}sync
> > > > used for.
> > >
> > > I think Minchan considers f{data}sync as a "commit" operation.
> >
> > But there is nothing like commit in that operation. It is simply a
> > destroy operation. ftruncate as Minchan mentions in another reply would
> > be a closer fit but how do you interpret the length argument? What about
> > memory regions which cannot be reaped?
> >
> > I do understand that reusing an existing mechanism is usually preferable
> > but the semantics should be reasonable and easy to reason about.
>
> Maybe then we can consider a flag for pidfd_send_signal() to indicate
> that we want a synchronous mm cleanup when SIGKILL is being sent?
> Similar to my original RFC but cleanup would happen in the context of
> the caller. That seems to me like the simplest and most obvious way of
> expressing what we want to accomplish. WDYT?

Yes, that would make sense. Although it would have to be a SIGKILL
specific flag IMO. But let's see what process management people think
about that.
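
For illustration, a minimal sketch of how the caller side of such an
interface might look from userspace. It uses only the existing
pidfd_open() and pidfd_send_signal() syscalls through syscall(2). The
helper name kill_via_pidfd() is made up for this example, and no
synchronous-reap flag exists: the flags argument is merely where a
hypothetical SIGKILL specific flag would be passed if one were added.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Fallback numbers from the generic syscall table for older headers
 * (some architectures, e.g. alpha, use different values). */
#ifndef SYS_pidfd_send_signal
#define SYS_pidfd_send_signal 424
#endif
#ifndef SYS_pidfd_open
#define SYS_pidfd_open 434
#endif

/*
 * Hypothetical helper: send SIGKILL to `pid` through a pidfd.
 *
 * `flags` is handed to pidfd_send_signal() unchanged. Kernels current
 * as of this thread only accept 0 here; a synchronous mm cleanup flag
 * is an assumption of this sketch, not an existing API.
 *
 * Returns 0 on success or a negative errno value on failure.
 */
static int kill_via_pidfd(pid_t pid, unsigned int flags)
{
	int ret = 0;
	int pidfd = (int)syscall(SYS_pidfd_open, pid, 0);

	if (pidfd < 0)
		return -errno;

	/* NULL siginfo: the kernel fills it in as kill(2) would. */
	if (syscall(SYS_pidfd_send_signal, pidfd, SIGKILL, NULL, flags) < 0)
		ret = -errno;

	close(pidfd);
	return ret;
}
```

Called as kill_via_pidfd(victim_pid, 0) this is just a plain kill via
pidfd. With a new flag the reaping would happen in the context of the
caller before the call returns, while a kernel that does not know the
flag would be expected to fail the call with EINVAL.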

--
Michal Hocko
SUSE Labs