Re: [PATCH v3 0/1] mm: introduce put_user_page*(), placeholder versions
From: Dave Chinner
Date: Sun Mar 10 2019 - 18:48:04 EST
On Fri, Mar 08, 2019 at 03:08:40AM +0000, Christopher Lameter wrote:
> On Wed, 6 Mar 2019, john.hubbard@xxxxxxxxx wrote:
> > Direct IO
> > =========
> >
> > Direct IO can cause corruption, if userspace does Direct-IO that writes to
> > a range of virtual addresses that are mmap'd to a file. The pages written
> > to are file-backed pages that can be under write back, while the Direct IO
> > is taking place. Here, Direct IO races with a write back: it calls
> > GUP before page_mkclean() has replaced the CPU pte with a read-only entry.
> > The race window is pretty small, which is probably why years have gone by
> > before we noticed this problem: Direct IO is generally very quick, and
> > tends to finish up before the filesystem gets around to doing anything with
> > the page contents. However, it's still a real problem. The solution is
> > to never let GUP return pages that are under write back, but instead,
> > force GUP to take a write fault on those pages. That way, GUP will
> > properly synchronize with the active write back. This does not change the
> > required GUP behavior, it just avoids that race.
>
> Direct IO on a mmapped file-backed page doesn't make any sense.

People have used it for many, many years as a zero-copy data movement
pattern, i.e. mmap the destination file, use direct IO to DMA directly
into the destination file's page cache pages, then fdatasync() to force
writeback of the destination file.
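As a rough userspace illustration of the pattern described above, here is a minimal sketch. The helper name dio_copy_via_mmap() and the 4096-byte alignment are assumptions made for the example, not anything taken from the patch series; real code would query the device's actual direct IO alignment.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* exposes O_DIRECT in <fcntl.h> */
#endif
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy the first len bytes of src_path into dst_path
 * by issuing direct IO reads straight into a shared mapping of the
 * destination file. len is assumed to be a non-zero multiple of 4096.
 * Returns 0 on success, -1 with errno set on failure.
 */
int dio_copy_via_mmap(const char *src_path, const char *dst_path, size_t len)
{
    int ret = -1, saved, src = -1, dst = -1;
    void *map = MAP_FAILED;
    size_t done = 0;

    if (len == 0 || len % 4096 != 0) {
        errno = EINVAL;
        return -1;
    }

    src = open(src_path, O_RDONLY | O_DIRECT);
    if (src < 0)
        goto out;
    dst = open(dst_path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (dst < 0)
        goto out;
    /* Size the destination so the whole mapping is backed by the file. */
    if (ftruncate(dst, (off_t)len) < 0)
        goto out;

    /* MAP_SHARED: stores into the mapping land in the file's page cache. */
    map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, dst, 0);
    if (map == MAP_FAILED)
        goto out;

    /* Direct IO read: the kernel pins the mapped destination pages (GUP)
     * and the device transfers data into them without a bounce buffer. */
    while (done < len) {
        ssize_t n = read(src, (char *)map + done, len - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            goto out;
        }
        if (n == 0) {           /* source is shorter than len */
            errno = EIO;
            goto out;
        }
        done += (size_t)n;
    }

    /* Force writeback of the now-dirty destination pages. */
    if (fdatasync(dst) < 0)
        goto out;
    ret = 0;
out:
    saved = errno;
    if (map != MAP_FAILED)
        munmap(map, len);
    if (dst >= 0)
        close(dst);
    if (src >= 0)
        close(src);
    errno = saved;
    return ret;
}
```

The read() into the mapping is the step that ends up calling GUP on file-backed pages, which is where the race with writeback discussed in the quoted text can occur. Some filesystems reject O_DIRECT opens with EINVAL, so callers should be prepared for that.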
Now that we have copy_file_range() to optimise this sort of data
movement, the need for games with mmap+direct IO largely goes away.
However, we still can't just remove that functionality, as doing so
would break lots of random userspace stuff...
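For comparison, a minimal sketch of the same file-to-file copy using copy_file_range(), which keeps the data movement inside the kernel and needs no user mapping. The helper name copy_whole_file() is made up for the example; the call needs a glibc and kernel that provide copy_file_range(), and older kernels may refuse copies across filesystems.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* exposes copy_file_range() in <unistd.h> */
#endif
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy all of src_path into dst_path with
 * copy_file_range(). Returns 0 on success, -1 with errno set on failure.
 */
int copy_whole_file(const char *src_path, const char *dst_path)
{
    struct stat st;
    off_t left;
    int ret = -1, saved, src, dst = -1;

    src = open(src_path, O_RDONLY);
    if (src < 0)
        return -1;
    if (fstat(src, &st) < 0)
        goto out;
    dst = open(dst_path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (dst < 0)
        goto out;

    /* NULL offsets: the kernel uses and advances both file positions. */
    for (left = st.st_size; left > 0; ) {
        ssize_t n = copy_file_range(src, NULL, dst, NULL, (size_t)left, 0);
        if (n < 0)
            goto out;
        if (n == 0)             /* source shrank while we were copying */
            break;
        left -= n;
    }

    /* Same durability step as in the mmap+direct IO pattern. */
    if (fdatasync(dst) < 0)
        goto out;
    ret = 0;
out:
    saved = errno;
    if (dst >= 0)
        close(dst);
    close(src);
    errno = saved;
    return ret;
}
```

Because no user pages are pinned here, this path does not go through GUP at all, which is why it sidesteps the problem for new code while existing mmap+direct IO users still have to keep working.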
Cheers,
Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx