Re: [PATCH 1/2] mm/madvise: allow process_madvise operations on entire memory range

From: Suren Baghdasaryan
Date: Tue Dec 22 2020 - 23:10:16 EST


On Tue, Dec 22, 2020 at 9:48 AM Suren Baghdasaryan <surenb@xxxxxxxxxx> wrote:
>
> On Tue, Dec 22, 2020 at 5:44 AM Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote:
> >
> > On Fri, Dec 11, 2020 at 09:27:46PM +0100, Jann Horn wrote:
> > > > Can we just use one element in iovec to indicate entire address rather
> > > > than using up the reserved flags?
> > > >
> > > > struct iovec iov = {
> > > >         .iov_base = NULL,
> > > >         .iov_len = (~(size_t)0),
> > > > };
> > >
> > > In addition to Suren's objections, I think it's also worth considering
> > > how this looks in terms of compat API. If a compat process does
> > > process_madvise() on another compat process, it would be specifying
> > > the maximum 32-bit number, rather than the maximum 64-bit number, so
> > > you'd need special code to catch that case, which would be ugly.
> > >
> > > And when a compat process uses this API on a non-compat process, it
> > > semantically gets really weird: The actual address range covered would
> > > be larger than the address range specified.
> > >
> > > And if we want different access checks for the two flavors in the
> > > future, gating that different behavior on special values in the iovec
> > > would feel too magical to me.
> > >
> > > And the length value SIZE_MAX doesn't really make sense anyway because
> > > the length of the whole address space would be SIZE_MAX+1, which you
> > > can't express.
> > >
> > > So I'm in favor of a new flag, and strongly against using SIZE_MAX as
> > > a magic number here.
> >
> > Yes, using SIZE_MAX is a horrible interface in this case. I'm not
> > a huge fan of a flag either. What is the use case for the madvise
> > to all of a process's address space anyway?
>
> Thanks for the feedback! The use case is userspace memory reaping
> similar to oom-reaper. Detailed justification is here:
> https://lore.kernel.org/linux-mm/20201124053943.1684874-1-surenb@xxxxxxxxxx

Actually, this post is the most informative and includes test results:
https://lore.kernel.org/linux-api/CAJuCfpGz1kPM3G1gZH+09Z7aoWKg05QSAMMisJ7H5MdmRrRhNQ@xxxxxxxxxxxxxx/
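
To make the SIZE_MAX objections quoted above concrete, here is a minimal
sketch in plain userspace C. It is not kernel code and not taken from the
patch: the helper names (max_len_for_width, is_whole_range_magic,
wrapped_full_length) are hypothetical and exist only for illustration. It
shows that the "maximum length" differs between a 32-bit (compat) and a
64-bit caller, so a magic-value check would need a special case, and that
the true length of a whole address space (maximum + 1) wraps to zero.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: largest length representable by an unsigned
 * integer of the given width. A full address space of that width is
 * one byte longer than this, so it cannot be expressed.
 */
static uint64_t max_len_for_width(unsigned int bits)
{
	return bits >= 64 ? UINT64_MAX : (((uint64_t)1 << bits) - 1);
}

/*
 * Hypothetical check that would be needed if "all ones" in iov_len
 * meant "entire address space": the magic value depends on whether
 * the caller is a 32-bit (compat) or a 64-bit task.
 */
static bool is_whole_range_magic(uint64_t iov_len, bool compat_caller)
{
	uint64_t magic = compat_caller ? max_len_for_width(32)
				       : max_len_for_width(64);

	return iov_len == magic;
}

/*
 * The real length of a 64-bit address space is UINT64_MAX + 1, which
 * wraps around to 0 in unsigned arithmetic.
 */
static uint64_t wrapped_full_length(void)
{
	uint64_t max = UINT64_MAX;

	return max + 1;
}
```

With these helpers, a compat caller's 0xffffffff only counts as the magic
value when the check knows the caller is compat. The same number from a
64-bit caller is just an ordinary length, which is the ambiguity a
dedicated flag avoids.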