Re: [PATCHv3 0/8] Linear Address Masking enabling

From: Kostya Serebryany
Date: Fri Jun 10 2022 - 16:22:31 EST


Thanks for working on this, please make LAM happen.
It enables efficient memory safety testing that is already available on AArch64.

Memory error detectors, such as ASAN and Valgrind (or KASAN for the kernel)
have limited applicability, primarily because of their run-time overheads
(CPU, RAM, and code size). In many cases, the major obstacle to a wider
deployment is the RAM overhead, which is typically 2x-3x. There is another tool,
HWASAN [1], which solves the same problem and has < 10% RAM overhead.
This tool is available only on AArch64 because it relies on the top-byte-ignore
(TBI) feature. Full support for that feature [2] has been added to the kernel
in order to enable HWASAN. Adding support for LAM will enable HWASAN on x86_64.
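
To illustrate what "relies on TBI" means in practice, here is a rough sketch of
the pointer-tagging arithmetic involved. It is not HWASAN's actual code: the
helper names are hypothetical, and the bit layouts are assumptions (tag in bits
63:56 for a TBI-style scheme, metadata in bits 62:57 for a LAM_U57-style scheme).

```c
#include <stdint.h>

/* Assumption: TBI-style layout, tag lives in the top byte (bits 63:56). */
#define TBI_TAG_SHIFT     56
#define TBI_TAG_MASK      (0xffULL << TBI_TAG_SHIFT)

/* Assumption: LAM_U57-style layout, 6 metadata bits (bits 62:57). */
#define LAM_U57_TAG_SHIFT 57
#define LAM_U57_TAG_MASK  (0x3fULL << LAM_U57_TAG_SHIFT)

/* Hypothetical helper: place 'tag' into the bits selected by 'mask'. */
static inline uint64_t tag_addr(uint64_t addr, uint64_t tag,
                                uint64_t mask, unsigned int shift)
{
	return (addr & ~mask) | ((tag << shift) & mask);
}

/* Hypothetical helper: clear the metadata bits to recover the plain address. */
static inline uint64_t untag_addr(uint64_t addr, uint64_t mask)
{
	return addr & ~mask;
}

/* Hypothetical helper: extract the tag stored in the metadata bits. */
static inline uint64_t addr_tag(uint64_t addr, uint64_t mask,
                                unsigned int shift)
{
	return (addr & mask) >> shift;
}
```

With TBI or LAM enabled, the hardware ignores these bits on a memory access, so
a tool can keep a tag in every pointer without stripping it before each
dereference. Without such support, the untagging step above would have to be
done in software on every access.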

HWASAN is already the main memory safety tool for Android [3] - the reduced RAM
overhead allowed us to utilize this testing tool where ASAN’s RAM overhead was
prohibitive. We have also prototyped the x86_64 variant of HWASAN, and we can
observe that it is a major improvement over ASAN. The kernel support and
hardware availability are the only missing parts.

Making HWASAN available on x86_64 will enable developers of server and client
software to scale up their memory safety testing, and thus improve the quality
and security of their products.


[1] https://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html
[2] https://www.kernel.org/doc/html/latest/arm64/tagged-address-abi.html
[3] https://source.android.com/devices/tech/debug/hwasan

--kcc


On Fri, Jun 10, 2022 at 7:35 AM Kirill A. Shutemov
<kirill.shutemov@xxxxxxxxxxxxxxx> wrote:
>
> Linear Address Masking[1] (LAM) modifies the checking that is applied to
> 64-bit linear addresses, allowing software to use the untranslated
> address bits for metadata.
>
> The patchset brings support for LAM for userspace addresses.
>
> LAM_U48 enabling is controversial since it competes for bits with
> 5-level paging. Its enabling is isolated into an optional last patch that
> can be applied at maintainer's discretion.
>
> Please review and consider applying.
>
> v3:
> - Rebased onto v5.19-rc1
> - Per-process enabling;
> - API overhaul (again);
> - Avoid branches and costly computations in the fast path;
> - LAM_U48 is in optional patch.
> v2:
> - Rebased onto v5.18-rc1
> - New arch_prctl(2)-based API
> - Expose status of LAM (or other thread features) in /proc/$PID/arch_status
>
> [1] ISE, Chapter 14.
> https://software.intel.com/content/dam/develop/external/us/en/documents-tps/architecture-instruction-set-extensions-programming-reference.pdf
>
> Kirill A. Shutemov (8):
> x86/mm: Fix CR3_ADDR_MASK
> x86: CPUID and CR3/CR4 flags for Linear Address Masking
> mm: Pass down mm_struct to untagged_addr()
> x86/mm: Handle LAM on context switch
> x86/uaccess: Provide untagged_addr() and remove tags before address check
> x86/mm: Provide ARCH_GET_UNTAG_MASK and ARCH_ENABLE_TAGGED_ADDR
> x86: Expose untagging mask in /proc/$PID/arch_status
> x86/mm: Extend LAM to support to LAM_U48
>
> arch/arm64/include/asm/memory.h | 4 +-
> arch/arm64/include/asm/signal.h | 2 +-
> arch/arm64/include/asm/uaccess.h | 4 +-
> arch/arm64/kernel/hw_breakpoint.c | 2 +-
> arch/arm64/kernel/traps.c | 4 +-
> arch/arm64/mm/fault.c | 10 +--
> arch/sparc/include/asm/pgtable_64.h | 2 +-
> arch/sparc/include/asm/uaccess_64.h | 2 +
> arch/x86/include/asm/cpufeatures.h | 1 +
> arch/x86/include/asm/elf.h | 3 +-
> arch/x86/include/asm/mmu.h | 2 +
> arch/x86/include/asm/mmu_context.h | 58 +++++++++++++++++
> arch/x86/include/asm/processor-flags.h | 2 +-
> arch/x86/include/asm/tlbflush.h | 3 +
> arch/x86/include/asm/uaccess.h | 44 ++++++++++++-
> arch/x86/include/uapi/asm/prctl.h | 3 +
> arch/x86/include/uapi/asm/processor-flags.h | 6 ++
> arch/x86/kernel/Makefile | 2 +
> arch/x86/kernel/fpu/xstate.c | 47 --------------
> arch/x86/kernel/proc.c | 50 +++++++++++++++
> arch/x86/kernel/process.c | 3 +
> arch/x86/kernel/process_64.c | 54 +++++++++++++++-
> arch/x86/kernel/sys_x86_64.c | 5 +-
> arch/x86/mm/hugetlbpage.c | 6 +-
> arch/x86/mm/mmap.c | 9 ++-
> arch/x86/mm/tlb.c | 62 ++++++++++++++-----
> .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 2 +-
> drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 2 +-
> drivers/gpu/drm/radeon/radeon_gem.c | 2 +-
> drivers/infiniband/hw/mlx4/mr.c | 2 +-
> drivers/media/common/videobuf2/frame_vector.c | 2 +-
> drivers/media/v4l2-core/videobuf-dma-contig.c | 2 +-
> .../staging/media/atomisp/pci/hmm/hmm_bo.c | 2 +-
> drivers/tee/tee_shm.c | 2 +-
> drivers/vfio/vfio_iommu_type1.c | 2 +-
> fs/proc/task_mmu.c | 2 +-
> include/linux/mm.h | 11 ----
> include/linux/uaccess.h | 11 ++++
> lib/strncpy_from_user.c | 2 +-
> lib/strnlen_user.c | 2 +-
> mm/gup.c | 6 +-
> mm/madvise.c | 2 +-
> mm/mempolicy.c | 6 +-
> mm/migrate.c | 2 +-
> mm/mincore.c | 2 +-
> mm/mlock.c | 4 +-
> mm/mmap.c | 2 +-
> mm/mprotect.c | 2 +-
> mm/mremap.c | 2 +-
> mm/msync.c | 2 +-
> virt/kvm/kvm_main.c | 2 +-
> 51 files changed, 342 insertions(+), 126 deletions(-)
> create mode 100644 arch/x86/kernel/proc.c
>
> --
> 2.35.1
>