Re: [RFC PATCH 3/5] mm/vma: add support for peer to peer to device vma

From: Koenig, Christian
Date: Wed Jan 30 2019 - 05:33:47 EST


On 30.01.19 at 09:02, Christoph Hellwig wrote:
> On Tue, Jan 29, 2019 at 08:58:35PM +0000, Jason Gunthorpe wrote:
>> On Tue, Jan 29, 2019 at 01:39:49PM -0700, Logan Gunthorpe wrote:
>>
>>> implement the mapping. And I don't think we should have 'special' vma's
>>> for this (though we may need something to ensure we don't get mapping
>>> requests mixed with different types of pages...).
>> I think Jerome explained the point here is to have a 'special vma'
>> rather than a 'special struct page' as, really, we don't need a
>> struct page at all to make this work.
>>
>> If I recall your earlier attempts at adding struct page for BAR
>> memory, it ran aground on issues related to O_DIRECT/sgls, etc, etc.
> Struct page is what makes O_DIRECT work, and what makes using sgls or
> biovecs, etc. on it work. Without struct page none of the above can work
> at all. That is why we use struct page for backing BARs in the existing
> P2P code.
> Not that I'm a particular fan of creating struct page for this device
> memory, but without major invasive surgery to large parts of the kernel
> it is the only way to make it work.

The problem seems to be that struct page does two things:

1. Memory management for system memory.
2. The object to work with in the I/O layer.

The two were combined because a good part of the bookkeeping overlaps,
for example reference counting how often a page is in use. The problem
now is that this doesn't work very well for device memory in some cases.

For example, on GPUs you usually have a large amount of memory which is
not even accessible by the CPU. In other words you can't easily create a
struct page for it because you can't reference it with a physical CPU
address.

Maybe struct page should be split up into smaller structures? I mean
it's really overloaded with data.
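
To make the idea a bit more concrete, here is a rough userspace sketch of
what such a split might look like. Nothing in it is existing kernel code:
struct mm_page, struct io_page and the io_page_*() helpers are all
hypothetical names made up for illustration, and the real struct page
carries far more state than this. The only point is that the I/O object
could keep the reference count and a bus address, while the memory
management part becomes optional for memory the CPU cannot address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: memory-management role, only for CPU-addressable memory. */
struct mm_page {
	uint64_t cpu_phys_addr;		/* physical address as seen by the CPU */
};

/* Hypothetical: the object the I/O layer works with. */
struct io_page {
	unsigned int refcount;		/* how often the page is in use */
	uint64_t bus_addr;		/* address used by a device for DMA */
	const struct mm_page *mm;	/* NULL if the CPU cannot address it */
};

/* Take one more reference on the I/O object. */
static inline void io_page_get(struct io_page *p)
{
	assert(p->refcount > 0);	/* must already be alive */
	p->refcount++;
}

/* Drop a reference; returns true when the last one is gone. */
static inline bool io_page_put(struct io_page *p)
{
	assert(p->refcount > 0);
	return --p->refcount == 0;
}

/* Only pages with a memory-management part are CPU accessible. */
static inline bool io_page_cpu_accessible(const struct io_page *p)
{
	return p->mm != NULL;
}
```

With something like this, GPU memory that is invisible to the CPU would
just be an io_page with mm == NULL, and the I/O paths would only need the
reference count and the bus address.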

Christian.