Re: [RFC PATCH 3/5] mm/vma: add support for peer to peer to device vma

From: Logan Gunthorpe
Date: Thu Jan 31 2019 - 14:44:31 EST




On 2019-01-31 12:35 p.m., Jerome Glisse wrote:
> So what is this O_DIRECT thing that keep coming again and again here :)
> What is the use case ? Note that bio will always have valid struct page
> of regular memory as using PCIE BAR for filesystem is crazy (you do not
> have atomic or cache coherence and many CPU instruction have _undefined_
> effect so what ever the userspace would do might do nothing.

The point is to be able to use a BAR as the source of data to write/read
from a file system. So as a simple example, if an NVMe drive had a CMB,
and you could map that CMB to userspace, you could do an O_DIRECT read
to the BAR on one drive and an O_DIRECT write from the BAR on another
drive. Thus you could bypass the upstream port of a switch (and
therefore all CPU resources) altogether.

For the most part nobody would want to put a filesystem on a BAR.
(Though there have been some crazy ideas to put persistent memory behind
a CMB...)

> Now if you want to use BAR address as destination or source of directIO
> then let just update the directIO code to handle this. There is no need
> to go hack every single place in the kernel that might deal with struct
> page or sgl. Just update the place that need to understand this. We can
> even update directIO to work on weird platform. The change to directIO
> will be small, couple hundred line of code at best.

Well if you want to figure out how to remove struct page from the entire
block layer that would help everybody. But until then, it's pretty much
impossible to use the block layer (and therefore O_DIRECT) without
struct page.

Logan