Re: [PATCH V2,2/2] mm: madvise: skip unmapped vma holes passed to process_madvise

From: Suren Baghdasaryan
Date: Thu Mar 17 2022 - 12:53:23 EST


On Thu, Mar 17, 2022 at 9:28 AM Minchan Kim <minchan@xxxxxxxxxx> wrote:
>
> On Wed, Mar 16, 2022 at 02:29:06PM -0700, Andrew Morton wrote:
> > On Wed, 16 Mar 2022 19:49:38 +0530 Charan Teja Kalla <quic_charante@xxxxxxxxxxx> wrote:
> >
> > > > IMO, it's worth noting in the man page.
> > > >
> > >
> > > Or is the current patch for just ENOMEM sufficient here, and we just
> > > have to update the man page?
> >
> > I think the "On success, process_madvise() returns the number of bytes
> > advised" behaviour sounds useful. But madvise() doesn't do that.
> >
> > RETURN VALUE
> > On success, madvise() returns zero. On error, it returns -1 and errno
> > is set to indicate the error.
> >
> > So why is it desirable in the case of process_madvise()?
>
> Since process_madvise deals with multiple ranges and could fail at one of
> them in the middle of processing, people could determine where the call
> failed and then decide whether to abort at that point or continue
> hinting the next addresses. The problem with this strategy is that the
> API doesn't return any error value if it has processed any bytes, so
> callers are limited in deciding on a policy. That's a limitation of
> every vector IO syscall, unfortunately.
>
> >
> >
> >
> > And why was process_madvise() designed this way? Or was it
> > always simply an error in the manpage?

Taking a closer look, the manpage does indeed seem to be wrong.
https://elixir.bootlin.com/linux/v5.17-rc8/source/mm/madvise.c#L1154
indicates that in the presence of unmapped holes madvise will skip
them but still return ENOMEM, and that is what process_madvise
ultimately returns in this case. So the manpage claim of "This
return value may be less than the total number of requested bytes, if
an error occurred after some iovec elements were already processed."
does not reflect reality in our case, because the return value will
be -ENOMEM. After the desired behavior is finalized I'll modify the
manpage accordingly.
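
For illustration, here is a minimal userspace sketch of the
skip-but-ENOMEM behaviour, using plain madvise() on the calling process
rather than process_madvise(). It assumes Linux, anonymous private
memory, and MADV_DONTNEED (so that a discarded page reads back as zero).
The names hole_result and probe_madvise_hole are made up for this
example and do not come from the kernel or the patch.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS and madvise() on glibc */
#endif
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical result record, for illustration only. */
struct hole_result {
    int ret;          /* return value of madvise() */
    int err;          /* errno captured right after the call */
    int first_zeroed; /* 1 if the page before the hole was discarded */
    int last_zeroed;  /* 1 if the page after the hole was discarded */
};

/*
 * Map three pages, punch a hole in the middle, then advise the whole
 * three-page range in a single call. Returns 0 if the probe ran,
 * -1 if the setup (mmap/munmap) failed.
 */
int probe_madvise_hole(struct hole_result *out)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    unsigned char *base;

    base = mmap(NULL, 3 * page, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return -1;

    /* Dirty every page so a later zero read proves it was discarded. */
    memset(base, 0xAA, 3 * page);

    /* Unmap the middle page: the range now contains an unmapped hole. */
    if (munmap(base + page, page) != 0) {
        munmap(base, 3 * page);
        return -1;
    }

    errno = 0;
    out->ret = madvise(base, 3 * page, MADV_DONTNEED);
    out->err = errno;

    /* Only touch the two pages that are still mapped. */
    out->first_zeroed = (base[0] == 0);
    out->last_zeroed = (base[2 * page] == 0);

    munmap(base, page);
    munmap(base + 2 * page, page);
    return 0;
}
```

On a kernel with the behaviour described above, I would expect ret to
be -1 with err set to ENOMEM, while both surrounding pages read back as
zero, i.e. the advice was applied to the mapped parts even though the
call reports an error.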