On 11.09.19 at 12:10, Thomas Hellström (VMware) wrote:
[SNIP]
>>> The problem seen in TTM is that we want to be able to change the
>>> vm_page_prot from the fault handler, but it's problematic since we
>>> have the mmap_sem typically only in read mode. Hence the fake vma
>>> hack. From what I can tell it's reasonably well-behaved, since
>>> pte_modify() skips the bits TTM updates, so mprotect() and mremap()
>>> work OK. I think split_huge_pmd() may run into trouble, but we don't
>>> support it (yet) with TTM.

>> Ah! I actually ran into this while implementing huge page support for
>> TTM and never figured out why that doesn't work. Dropped CPU huge page
>> support because of this.

> By coincidence, I got slightly sidetracked the other day and started
> looking at this as well. Got to the point where I figured out all the
> hairy alignment issues and actually got huge_fault() calls, but never
> implemented the handler. I think that's definitely something worth
> having. Not sure it will work for IO memory, though (split_huge_pmd()
> will just skip non-page-backed memory), but if we only support
> VM_SHARED (non-COW) vmas there's no reason to split the huge pmds
> anyway. Definitely something we should have IMO.

Well, our primary use case would be IO memory, because system memory is
only optionally allocated as huge pages, but we nearly always allocate
VRAM in chunks of at least 2MB because we otherwise get a huge
performance penalty.
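
To illustrate the kind of alignment constraint mentioned above, here is a
minimal userspace sketch, not actual TTM or kernel code. It assumes 4 KiB
pages, 2 MiB huge PMDs and physically contiguous backing; all sketch_*
names are hypothetical. A fault can only be served with one huge entry if
the surrounding 2 MiB region lies fully inside the vma and the pfn backing
the start of that region is itself 2 MiB aligned.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed geometry: 4 KiB pages, 2 MiB huge PMD entries. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PMD_SIZE   (UINT64_C(1) << 21)
#define SKETCH_PMD_MASK   (~(SKETCH_PMD_SIZE - 1))

/* Hypothetical stand-in for the address range of a vma. */
struct sketch_range {
	uint64_t vm_start;  /* first mapped virtual address */
	uint64_t vm_end;    /* one past the last mapped address */
};

/*
 * Return true if the 2 MiB region containing fault_addr could be mapped
 * with a single huge entry. base_pfn is the pfn backing vm_start; the
 * backing memory is assumed to be contiguous.
 */
static bool sketch_can_map_huge(const struct sketch_range *vma,
				uint64_t fault_addr, uint64_t base_pfn)
{
	uint64_t huge_start = fault_addr & SKETCH_PMD_MASK;
	uint64_t pfn;

	/* The whole huge region must be covered by the vma. */
	if (huge_start < vma->vm_start ||
	    huge_start + SKETCH_PMD_SIZE > vma->vm_end)
		return false;

	/* The pfn backing the start of the region must be 2 MiB aligned. */
	pfn = base_pfn + ((huge_start - vma->vm_start) >> SKETCH_PAGE_SHIFT);
	return (pfn & ((SKETCH_PMD_SIZE >> SKETCH_PAGE_SHIFT) - 1)) == 0;
}
```

In other words, the virtual address and the physical offset have to be
congruent modulo 2 MiB, which is presumably why allocating VRAM in 2MB
chunks helps only if the mmap address is chosen to match.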

>>> We could probably get away with a WRITE_ONCE() update of the
>>> vm_page_prot before taking the page table lock since
>>> a) We're locking out all other writers.
>>> b) We can't race with another fault to the same vma since we hold an
>>> address space lock ("buffer object reservation").
>>> c) When we need to update there are no valid page table entries in the
>>> vma, since it only happens directly after mmap(), or after an
>>> unmap_mapping_range() with the same address space lock. When another
>>> reader (for example split_huge_pmd()) sees a valid page table entry,
>>> it also sees the new page protection and things are fine.

>> Yeah, that's exactly why I always wondered why we need this hack with
>> the vma copy on the stack.

>>> But that would really be a special case. To solve this properly we'd
>>> probably need an additional lock to protect the vm_flags and
>>> vm_page_prot, taken after mmap_sem and i_mmap_lock.

>> Well, we already have a special lock for this: the reservation object.
>> So memory barriers etc. should be in place and I also think we can just
>> update the vm_page_prot on the fly.

> I agree. This is needed for huge pages. We should make this change,
> and perhaps add the justification above as a comment.

Alternatively, we could introduce a new VM_* flag telling users of
vm_page_prot to just let the page table entries be filled by faults
again.
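
For illustration only, the a) to c) argument quoted above can be modelled
in userspace roughly as follows. This is a sketch under stated
assumptions, not the TTM implementation: a pthread mutex stands in for
the buffer object reservation, a volatile store approximates
WRITE_ONCE(), a counter stands in for the vma's valid page table
entries, and every sketch_* name is hypothetical.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a TTM buffer object. */
struct sketch_bo {
	pthread_mutex_t resv;   /* plays the role of the reservation object */
	uint64_t wanted_prot;   /* protection the current placement requires */
};

/* Hypothetical stand-in for the vma fields that matter here. */
struct sketch_vma {
	uint64_t vm_page_prot;
	unsigned long nr_ptes;  /* valid page table entries in this vma */
	struct sketch_bo *bo;
};

/* Userspace approximation of WRITE_ONCE() for a 64-bit field. */
static void sketch_write_once(uint64_t *p, uint64_t val)
{
	*(volatile uint64_t *)p = val;
}

/*
 * Move the buffer: zap all PTEs while holding the reservation, as
 * unmap_mapping_range() would, then change the required protection.
 */
static void sketch_move_bo(struct sketch_vma *vma, uint64_t new_prot)
{
	pthread_mutex_lock(&vma->bo->resv);
	vma->nr_ptes = 0;
	vma->bo->wanted_prot = new_prot;
	pthread_mutex_unlock(&vma->bo->resv);
}

/* Fault handler model; returns false if condition c) would be violated. */
static bool sketch_fault(struct sketch_vma *vma, uint64_t *pte_prot)
{
	struct sketch_bo *bo = vma->bo;
	bool ok = true;

	pthread_mutex_lock(&bo->resv);          /* a) and b): sole writer */
	if (vma->vm_page_prot != bo->wanted_prot) {
		if (vma->nr_ptes != 0)          /* c): no valid PTEs allowed */
			ok = false;
		else
			sketch_write_once(&vma->vm_page_prot, bo->wanted_prot);
	}
	if (ok) {
		*pte_prot = vma->vm_page_prot;  /* "insert" one PTE */
		vma->nr_ptes++;
	}
	pthread_mutex_unlock(&bo->resv);
	return ok;
}
```

The point of the model is only that vm_page_prot changes exclusively
while the vma has no valid entries and the reservation is held, so any
entry a reader finds was created with the protection it reads.
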
Christian.