Re: Should implementations of ->direct_access be allowed to sleep?

From: Dave Chinner
Date: Thu Mar 26 2015 - 15:32:45 EST


On Thu, Mar 26, 2015 at 01:09:18PM -0400, Matthew Wilcox wrote:
> On Tue, Mar 24, 2015 at 11:50:47AM -0700, Matt Mullins wrote:
> > We're also developing a user of direct_access, and we ended up with some
> > questions about the sleeping guarantees of the direct_access API.
>
> That's a great question. Since DAX can always sleep when it's calling
> into bdev_direct_access(), I hadn't thought about it (DAX is basically
> called to handle page faults and do I/O; both of which are expected
> to sleep).
>
> > Since brd is currently the only (x86) implementation of DAX in Linus's tree,
> > I've been testing against that. We noticed that the brd implementation of DAX
> > can call into alloc_page() with __GFP_WAIT if we call direct_access() on a page
> > that has not yet been allocated. This is compounded by the fact that brd does
> > not support size > PAGE_SIZE (and thus I call bdev_direct_access() on each use),
> > though the limitation makes sense -- I shouldn't expect the brd driver to be
> > able to allocate a gigabyte of contiguous memory.
> >
> > The potential sleeping behavior was somewhat surprising to me, as I would expect
> > the NV-DIMM device implementation to simply offset the pfn at which the device
> > is located rather than perform a memory allocation. What are the guaranteed
> > and/or expected contexts from which direct_access() can be safely called?
>
> Yes, for 'real' NV-DIMM devices, as you can see by the ones in tree,
> as well as the pmem driver that Ross has been posting, it's a simple
> piece of arithmetic. The question is whether we should make all users
> of ->direct_access accommodate brd, or whether we should change brd so
> that it doesn't sleep.
>
> I'm leaning towards the latter. But I'm not sure what GFP flags to
> recommend that brd use ... GFP_NOWAIT | __GFP_ZERO, perhaps?

What, so we get random IO failures under memory pressure?

I really think we should allow .direct_access to sleep. It means we
can use existing drivers and it also allows future implementations
that might require, say, RDMA to be performed to update a page
before access is granted. i.e. .direct_access is the first hook into
the persistent device at page fault time....

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/