Re: [PATCH resend RFC 0/9] s390: fixes, cleanups and optimizations for page table walkers

From: Christian Borntraeger
Date: Tue Sep 28 2021 - 07:06:44 EST



On 28.09.21 at 12:59, Heiko Carstens wrote:
On Thu, Sep 09, 2021 at 06:22:39PM +0200, David Hildenbrand wrote:
Resend because I missed ccing people on the actual patches ...

RFC because the patches are essentially untested and I did not actually
try to trigger any of the things these patches are supposed to fix. It
merely matches my current understanding (and what other code does :) ). I
did compile-test as far as possible.

After learning more about the wonderful world of page tables and their
interaction with the mmap_sem and VMAs, I spotted some issues in our
page table walkers that allow user space to trigger nasty behavior when
playing dirty tricks with munmap() or mmap() of hugetlb. While some issues
should be hard to trigger, others are fairly easy because we provide
convenient interfaces (e.g., KVM_S390_GET_SKEYS and KVM_S390_SET_SKEYS).

Future work:
- Don't use get_locked_pte() when it's not required to actually allocate
page tables -- similar to how storage keys are now handled. Examples are
get_pgste() and __gmap_zap().
- Don't use get_locked_pte() and instead let page fault logic allocate page
tables when we actually do need page tables -- also, similar to how
storage keys are now handled. Examples are set_pgste_bits() and
pgste_perform_essa().
- Maybe switch to mm/pagewalk.c to avoid custom page table walkers. For
__gmap_zap() that's very easy.

Cc: Christian Borntraeger <borntraeger@xxxxxxxxxx>
Cc: Janosch Frank <frankja@xxxxxxxxxxxxx>
Cc: Cornelia Huck <cohuck@xxxxxxxxxx>
Cc: Claudio Imbrenda <imbrenda@xxxxxxxxxxxxx>
Cc: Heiko Carstens <hca@xxxxxxxxxxxxx>
Cc: Vasily Gorbik <gor@xxxxxxxxxxxxx>
Cc: Niklas Schnelle <schnelle@xxxxxxxxxxxxx>
Cc: Gerald Schaefer <gerald.schaefer@xxxxxxxxxxxxx>
Cc: Ulrich Weigand <Ulrich.Weigand@xxxxxxxxxx>

For the whole series:
Acked-by: Heiko Carstens <hca@xxxxxxxxxxxxx>

Christian, given that this is mostly about KVM I'd assume this should
go via the KVM tree. Patch 6 (pci_mmio) is already upstream.

Right, I think I will queue this even without testing for now.
Claudio, is patch 7 ok for you with the explanation from David?