Re: [PATCH v2 0/8] Fix several device private page reference counting issues
From: Vlastimil Babka (SUSE)
Date: Tue Oct 25 2022 - 06:21:36 EST
On 9/28/22 14:01, Alistair Popple wrote:
> This series aims to fix a number of page reference counting issues in
> drivers dealing with device private ZONE_DEVICE pages. These result in
> use-after-free type bugs, either from accessing a struct page which no
> longer exists because it has been removed or accessing fields within the
> struct page which are no longer valid because the page has been freed.
>
> During normal usage it is unlikely these will cause any problems. However
> without these fixes it is possible to crash the kernel from userspace.
> These crashes can be triggered either by unloading the kernel module or
> unbinding the device from the driver prior to a userspace task exiting. In
> modules such as Nouveau it is also possible to trigger some of these issues
> by explicitly closing the device file-descriptor prior to the task exiting
> and then accessing device private memory.
Hi, since a CVE [1] has been filed on the basis of this series, do you think
a stable backport is warranted? I think the statement "It is possible to
launch the attack remotely." in [1] is incorrect though, right?
It looks to me like patch 1 would be needed in every kernel since
CONFIG_DEVICE_PRIVATE was introduced, while the following few apply only to
kernels with 27674ef6c73f (probably not so critical, as that includes no LTS)?
Thanks,
Vlastimil
[1] https://nvd.nist.gov/vuln/detail/CVE-2022-3523
> This involves some minor changes to both PowerPC and AMD GPU code.
> Unfortunately I lack hardware to test either of those so any help there
> would be appreciated. The changes mimic what is done for both Nouveau
> and hmm-tests though so I doubt they will cause problems.
>
> To: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> To: linux-mm@xxxxxxxxx
> Cc: linux-kernel@xxxxxxxxxxxxxxx
> Cc: amd-gfx@xxxxxxxxxxxxxxxxxxxxx
> Cc: nouveau@xxxxxxxxxxxxxxxxxxxxx
> Cc: dri-devel@xxxxxxxxxxxxxxxxxxxxx
>
> Alistair Popple (8):
> mm/memory.c: Fix race when faulting a device private page
> mm: Free device private pages have zero refcount
> mm/memremap.c: Take a pgmap reference on page allocation
> mm/migrate_device.c: Refactor migrate_vma and migrate_deivce_coherent_page()
> mm/migrate_device.c: Add migrate_device_range()
> nouveau/dmem: Refactor nouveau_dmem_fault_copy_one()
> nouveau/dmem: Evict device private memory during release
> hmm-tests: Add test for migrate_device_range()
>
>  arch/powerpc/kvm/book3s_hv_uvmem.c       |  17 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_migrate.c |  19 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_migrate.h |   2 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_svm.c     |  11 +-
>  drivers/gpu/drm/nouveau/nouveau_dmem.c   | 108 +++++++----
>  include/linux/memremap.h                 |   1 +-
>  include/linux/migrate.h                  |  15 ++-
>  lib/test_hmm.c                           | 129 ++++++++++---
>  lib/test_hmm_uapi.h                      |   1 +-
>  mm/memory.c                              |  16 +-
>  mm/memremap.c                            |  30 ++-
>  mm/migrate.c                             |  34 +--
>  mm/migrate_device.c                      | 239 +++++++++++++++++-------
>  mm/page_alloc.c                          |   8 +-
>  tools/testing/selftests/vm/hmm-tests.c   |  49 +++++-
>  15 files changed, 516 insertions(+), 163 deletions(-)
>
> base-commit: 088b8aa537c2c767765f1c19b555f21ffe555786