Re: [PATCH v3 1/4] mm: introduce get_user_pages_longterm
From: Dan Williams
Date: Mon Dec 04 2017 - 12:01:26 EST
On Mon, Dec 4, 2017 at 1:31 AM, Michal Hocko <mhocko@xxxxxxxxxx> wrote:
>
> On Fri 01-12-17 08:29:53, Dan Williams wrote:
> > On Fri, Dec 1, 2017 at 8:02 AM, Jason Gunthorpe <jgg@xxxxxxxx> wrote:
> > >
> > > On Fri, Dec 01, 2017 at 11:12:18AM +0100, Michal Hocko wrote:
> > > > On Thu 30-11-17 12:01:17, Jason Gunthorpe wrote:
> > > > > On Thu, Nov 30, 2017 at 10:32:42AM -0800, Dan Williams wrote:
> > > > > > > Who and how many LRU pages can pin that way and how do you prevent nasty
> > > > > > > users to DoS systems this way?
> > > > > >
> > > > > > I assume this is something the RDMA community has had to contend with?
> > > > > > I'm not an RDMA person, I'm just here to fix dax.
> > > > >
> > > > > The RDMA implementation respects the mlock rlimit
> > > >
> > > > OK, so then I am kind of lost in why do we need a special g-u-p variant.
> > > > The documentation doesn't say and quite contrary it assumes that the
> > > > caller knows what he is doing. This cannot be the right approach.
> > >
> > > I thought it was because get_user_pages_longterm is supposed to fail
> > > on DAX mappings?
> >
> > Correct, the rlimit checks are a separate issue,
> > get_user_pages_longterm is only there to avoid open coding vma lookup
> > and vma_is_fsdax() checks in multiple code paths.
>
> Then it is a terrible misnomer. One would expect this is a proper way to
> get a longterm pin on a page.

Yes, I can see that. The "get_user_pages_longterm" symbol name is
encoding the lifetime expectations of the caller vs properly
implementing 'longterm' pinning. However the proper interface to
establish a long term pin does not currently exist and
ultimately needs more coordination with userspace. We need a way for
the kernel to explicitly revoke the pin. So, this
get_user_pages_longterm change is only a stop-gap to prevent data
corruption and userspace from growing further expectations that
filesystem-dax supports long term pinning through the legacy
interfaces.
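
To make the "vma lookup and vma_is_fsdax() checks" mentioned above
concrete, here is a minimal userspace sketch of the pattern: pin the
pages, then reject the whole request if any of them sits in a
filesystem-dax mapping. Everything prefixed with mock_ is a hypothetical
stand-in invented for illustration, not the kernel's real types or the
code in this patch; only the overall shape (pin, check, unwind with
-EOPNOTSUPP) is meant to be suggestive.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a vma; is_fsdax plays the role of vma_is_fsdax(). */
struct mock_vma {
	bool is_fsdax;
};

/* Hypothetical stand-in for a page; refcount models the reference a pin holds. */
struct mock_page {
	int refcount;
	struct mock_vma *vma;
};

/* Stand-in for a plain gup call: take one reference on every page. */
static long mock_get_user_pages(struct mock_page *pages, long nr_pages)
{
	for (long i = 0; i < nr_pages; i++)
		pages[i].refcount++;
	return nr_pages;
}

/* Pin pages, then fail the whole request if any page is in an fsdax vma. */
long mock_get_user_pages_longterm(struct mock_page *pages, long nr_pages)
{
	long rc, i;

	assert(pages != NULL && nr_pages >= 0);

	rc = mock_get_user_pages(pages, nr_pages);

	for (i = 0; i < rc; i++)
		if (pages[i].vma->is_fsdax)
			break;

	/* No fsdax mapping found: the pins stay in place. */
	if (i >= rc)
		return rc;

	/* Drop every pin taken above so the caller holds nothing on failure. */
	for (i = 0; i < rc; i++)
		pages[i].refcount--;
	return -EOPNOTSUPP;
}
```

Note that the sketch does nothing about rlimit accounting or about
revoking a pin later; it only centralizes the dax check so that callers
do not have to open code it.
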

> > > And maybe we should think about moving the rlimit accounting into this
> > > new function too someday?
> >
> > DAX pages are not accounted in any rlimit because they are statically
> > allocated reserved memory regions.
>
> Which is OK, but how do you prevent anybody calling this function on
> normal LRU pages?

I don't, and I hadn't considered this angle, since the existing gup
interfaces do not address it either. It is an additional gap we need
to fill.