Re: [RFC PATCH 1/2] mm,drm/ttm: Block fast GUP to TTM huge pages
From: Jason Gunthorpe
Date: Thu Mar 25 2021 - 14:25:28 EST

On Thu, Mar 25, 2021 at 07:13:33PM +0100, Thomas Hellström (Intel) wrote:
>
> On 3/25/21 6:55 PM, Jason Gunthorpe wrote:
> > On Thu, Mar 25, 2021 at 06:51:26PM +0100, Thomas Hellström (Intel) wrote:
> > > On 3/24/21 9:25 PM, Dave Hansen wrote:
> > > > On 3/24/21 1:22 PM, Thomas Hellström (Intel) wrote:
> > > > > > We also have not been careful at *all* about how _PAGE_BIT_SOFTW* are
> > > > > > used. It's quite possible we can encode another use even in the
> > > > > > existing bits.
> > > > > >
> > > > > > Personally, I'd just try:
> > > > > >
> > > > > > #define _PAGE_BIT_SOFTW5 57 /* available for programmer */
> > > > > >
> > > > > OK, I'll follow your advice here. FWIW I grepped for SW1 and it seems
> > > > > used in a selftest, but only for PTEs AFAICT.
> > > > >
> > > > > Oh, and we don't care about 32-bit much anymore?
> > > > On x86, we have 64-bit PTEs when running 32-bit kernels if PAE is
> > > > enabled. IOW, we can handle the majority of 32-bit CPUs out there.
> > > >
> > > > But, yeah, we don't care about 32-bit. :)
> > > Hmm,
> > >
> > > Actually it makes some sense to use SW1, to make it end up in the same dword
> > > as the PSE bit, as from what I can tell, reading of a 64-bit pmd_t on 32-bit
> > > PAE is not atomic, so in theory a huge pmd could be modified while reading
> > > the pmd_t making the dwords inconsistent.... How does that work with fast
> > > gup anyway?
> > It loops to get an atomic 64 bit value if the arch can't provide an
> > atomic 64 bit load
>
> Hmm, ok, I see a READ_ONCE() in gup_pmd_range(), and then the resulting pmd
> is dereferenced either in try_grab_compound_head() or __gup_device_huge(),
> before the pmd is compared to the value the pointer is currently pointing
> to. Couldn't those dereferences be on invalid pointers?

Uhhhhh.. That does look questionable, yes. Unless there is some tricky
reason why a 64 bit pmd entry on a 32 bit arch either can't exist or
has a stable upper 32 bits..

The pte level handles this with ptep_get_lockless(); we probably need
the same for the other levels too, instead of open coding a READ_ONCE()?
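
To make the idea concrete, here is a rough userspace sketch of the
low/high/low re-read loop that a lockless read of a split 64 bit entry
relies on. It is not the kernel's code: struct split_entry and
split_entry_read_lockless() are made-up names, and C11 acquire loads
stand in for the kernel's read barriers. It also assumes the writer
changes the low word (the one carrying the present bit) whenever it
changes the entry, which is what makes the re-check meaningful.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a 64 bit page table entry that a 32 bit
 * CPU can only load as two separate 32 bit words.
 */
struct split_entry {
	_Atomic uint32_t low;	/* assumed to carry the present bit */
	_Atomic uint32_t high;
};

/*
 * Read low, then high, then re-check low. If low changed in between,
 * the two halves may belong to different entries, so retry.
 *
 * Assumption: a writer never changes high without also changing low
 * (e.g. it clears low first, then writes high, then writes low).
 */
static uint64_t split_entry_read_lockless(struct split_entry *e)
{
	uint32_t low, high;

	do {
		/* acquire keeps the later loads from moving before this one */
		low = atomic_load_explicit(&e->low, memory_order_acquire);
		high = atomic_load_explicit(&e->high, memory_order_acquire);
	} while (low != atomic_load_explicit(&e->low, memory_order_acquire));

	return ((uint64_t)high << 32) | low;
}
```

A pmd level equivalent would return a value whose two halves are known
to be consistent before anything derived from it gets dereferenced,
which is the property the open coded READ_ONCE() does not give on 32
bit PAE.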
Jason