Re: [HMM 00/15] HMM (Heterogeneous Memory Management) v23
From: Jerome Glisse
Date: Fri Jun 16 2017 - 14:04:20 EST
On Fri, Jun 16, 2017 at 05:55:52PM +0000, Bridgman, John wrote:
> >-----Original Message-----
> >From: Jerome Glisse [mailto:jglisse@xxxxxxxxxx]
> >Sent: Friday, June 16, 2017 10:48 AM
> >To: Bridgman, John
> >Cc: akpm@xxxxxxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; linux-
> >mm@xxxxxxxxx; Dan Williams; Kirill A . Shutemov; John Hubbard; Sander, Ben;
> >Kuehling, Felix
> >Subject: Re: [HMM 00/15] HMM (Heterogeneous Memory Management) v23
> >
> >On Fri, Jun 16, 2017 at 07:22:05AM +0000, Bridgman, John wrote:
> >> Hi Jerome,
> >>
> >> I'm just getting back to this; sorry for the late responses.
> >>
> >> Your description of HMM talks about blocking CPU accesses when a page
> >> has been migrated to device memory, and you treat that as a "given" in
> >> the HMM design. Other than BAR limits, coherency between CPU and
> >> device caches, and performance of read-intensive CPU accesses to device
> >> memory, are there any other reasons for this?
> >
> >Correct, that is the list of reasons for it. Note that HMM is more of a toolbox
> >than one monolithic thing. For instance, there is also the HMM-CDM patchset,
> >which does allow GPU memory to be mapped to the CPU, but that relies on CAPI
> >or CCIX to keep the same memory model guarantees.
> >
> >
> >> The reason I'm asking is that we make fairly heavy use of large BAR
> >> support which allows the CPU to directly access all of the device
> >> memory on each of the GPUs, albeit without cache coherency, and there
> >> are some cases where it appears that allowing CPU access to the page
> >> in device memory would be more efficient than constantly migrating
> >> back and forth.
> >
> >The thing is, we are designing for arbitrary programs and we cannot make
> >any assumption about what kind of instructions a program might run on such
> >memory. So if a program tries to do an atomic operation on it, IIRC it is
> >undefined what is supposed to happen.
>
> Thanks... I thought I was missing something from the list. Agree that we
> need to provide consistent behaviour, and we definitely care about atomics.
> If we could get consistent behaviour with the page still in device memory,
> are you aware of any other problems related to HMM itself?
Well, the only way to get consistency is with a CCIX or CAPI bus. I would need
to do an in-depth reading of PCIe, but from memory this isn't doable with any
of the existing PCIe standards.
Note that I have HMM-CDM specifically for the case where you have cache
coherent device memory that behaves just like regular memory. When you use
HMM-CDM and migrate to GPU memory, the page is still mapped into the CPU
address space. HMM-CDM is a separate patchset that I posted a couple of days
ago.

So if you have cache coherent device memory that behaves like regular memory,
what you want is HMM-CDM, and migrated pages stay mapped in the CPU page
table.
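
To illustrate, here is a rough driver-side sketch of registering a range of
coherent device memory with HMM-CDM so that migrated pages stay CPU mapped.
The names (hmm_devmem_add_resource(), struct hmm_devmem_ops) are from my
memory of the patchset, so treat this as an approximation of the interface,
not the final thing:

/*
 * Rough sketch only: register cache coherent device memory with HMM-CDM.
 * Names follow my recollection of the patchset and may not match the
 * final interface.
 */
#include <linux/device.h>
#include <linux/err.h>
#include <linux/hmm.h>
#include <linux/ioport.h>
#include <linux/mm.h>

static void my_devmem_free(struct hmm_devmem *devmem, struct page *page)
{
	/* Hand the page back to the device driver's own allocator. */
}

static int my_devmem_fault(struct hmm_devmem *devmem,
			   struct vm_area_struct *vma,
			   unsigned long addr,
			   const struct page *page,
			   unsigned int flags,
			   pmd_t *pmdp)
{
	/*
	 * With coherent (CDM) memory the CPU accesses the page directly,
	 * so this callback is not expected to run; with non coherent
	 * device memory it would migrate the page back to system memory.
	 */
	return -EINVAL;
}

static const struct hmm_devmem_ops my_devmem_ops = {
	.free	= my_devmem_free,
	.fault	= my_devmem_fault,
};

/*
 * 'res' describes the coherent device memory (for instance exposed over
 * CAPI or CCIX).  After migration the struct pages backing this range
 * stay mapped in the CPU page table, so CPU loads, stores and atomics
 * keep their usual semantics.
 */
static int my_register_cdm(struct device *dev, struct resource *res)
{
	struct hmm_devmem *devmem;

	devmem = hmm_devmem_add_resource(&my_devmem_ops, dev, res);
	if (IS_ERR(devmem))
		return PTR_ERR(devmem);
	return 0;
}
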
Jérôme