Re: [PATCH 1/2] mm: introduce put_user_page*(), placeholder versions

From: Jan Kara
Date: Wed Dec 19 2018 - 06:35:50 EST


On Wed 19-12-18 21:28:25, Dave Chinner wrote:
> On Tue, Dec 18, 2018 at 08:03:29PM -0700, Jason Gunthorpe wrote:
> > On Wed, Dec 19, 2018 at 10:42:54AM +1100, Dave Chinner wrote:
> >
> > > Essentially, what we are talking about is how to handle broken
> > > hardware. I say we should just burn it with napalm and thermite
> > > (i.e. taint the kernel with "unsupportable hardware") and force
> > > wait_for_stable_page() to trigger when there are GUP mappings if
> > > the underlying storage doesn't already require it.
> >
> > If you want to ban O_DIRECT/etc from writing to file backed pages,
> > then just do it.
>
> O_DIRECT IO *isn't the problem*.

That is not true. O_DIRECT IO is a problem. In some aspects it is easier
than the problem with RDMA but currently O_DIRECT IO can crash your machine
or corrupt data the same way RDMA can. Just the race window is much
smaller. So we have to fix the generic GUP infrastructure to make O_DIRECT
IO work. I agree that fixing RDMA will likely require even more work like
revokable leases or what not.

> O_DIRECT IO uses a short term pin that the existing prefaulting
> during GUP works just fine for. The problem we have is the long term
> pins where pages can be cleaned while the pages are pinned. i.e. the
> use case we currently have to disable for DAX because *we can't make
> it work sanely* without either revokable file leases and/or hardware
> that is able to trigger page faults when they need write access to a
> clean page.

I would like to find a solution to the O_DIRECT IO problem while making the
infrastructure reusable also for solving the problems with RDMA... Because
nobody wants to go through those couple hundred get_user_pages() users in
the kernel twice...

Honza
--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR