Re: `AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT=y` causes AMDGPU to fail on Ryzen: amdgpu: SME is not compatible with RAVEN

From: Alex Deucher
Date: Wed Oct 06 2021 - 14:10:46 EST


On Wed, Oct 6, 2021 at 1:48 PM Borislav Petkov <bp@xxxxxxxxx> wrote:
>
> Ok,
>
> so I sat down and wrote something and tried to capture all the stuff we
> talked about so that it is clear in the future why we did it.
>
> Thoughts?
>
> ---
> From: Borislav Petkov <bp@xxxxxxx>
> Date: Wed, 6 Oct 2021 19:34:55 +0200
> Subject: [PATCH] x86/Kconfig: Do not enable AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT
> automatically
>
> This Kconfig option was added initially so that memory encryption is
> enabled by default on machines which support it.
>
> However, Raven-class GPUs, a.o., cannot handle DMA masks which are
> shorter than the bit position of the encryption, aka C-bit. For that,
> those devices need to have the IOMMU present.

This is not limited to Raven. All GPUs (and quite a few other
devices) have a limited DMA mask. AMD GPUs have between 32 and 48
bits of DMA depending on what generation the hardware is. So to
support SME, you either need swiotlb with bounce buffers or you need
IOMMU in remapping mode. The limitation with Raven is that, if you want
to use it with the IOMMU enabled, the IOMMU has to be set up in
passthrough mode, both to support IOMMUv2 functionality for compute and
because of other hardware limitations on the display side. So for all
GPUs except Raven, just having the IOMMU enabled in remapping mode is
fine. GPUs from other vendors would likely run into similar
limitations. Raven just has further limitations.
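
To make the cases above concrete, here is a rough sketch of the decision
in Python. It is not kernel code: the function name, the mode labels
("remap", "passthrough", "off") and the C-bit position of 47 used in the
examples are assumptions made for illustration only.

```python
def dma_path(dma_bits: int, c_bit: int, iommu_mode: str) -> str:
    """Return how DMA to encrypted memory could be serviced.

    dma_bits   -- width of the device DMA mask (32 to 48 for AMD GPUs)
    c_bit      -- bit position of the SME encryption bit (assumed input)
    iommu_mode -- "remap", "passthrough" or "off" (labels made up here)
    """
    # A mask of N bits covers address bits 0..N-1, so the device can only
    # set the C-bit itself if its mask is wider than the C-bit position.
    if dma_bits > c_bit:
        return "direct"
    # Otherwise something has to translate on the device's behalf.
    if iommu_mode == "remap":
        return "iommu"
    # IOMMU off or in passthrough mode: fall back to bounce buffers.
    return "swiotlb"


# Example values only: a 44-bit GPU with the C-bit assumed at bit 47.
print(dma_path(44, 47, "remap"))        # iommu
print(dma_path(44, 47, "passthrough"))  # swiotlb
```

Raven ends up in the second case when the IOMMU is enabled, because it
needs passthrough mode.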


>
> If the IOMMU is disabled or in passthrough mode, though, the kernel
> would switch to SWIOTLB bounce-buffering for those transfers.
>
> In order to avoid that,
>
> 2cc13bb4f59f ("iommu: Disable passthrough mode when SME is active")
>
> disables the default IOMMU passthrough mode so that devices for which
> the default 256K DMA is insufficient, can use the IOMMU instead.
>
> However 2, there are cases where the IOMMU is disabled in the BIOS, etc,
> think the usual hardware folk "oops, I dropped the ball there" cases.
>
> Which means, it can happen that there are systems out there with devices
> which need the IOMMU to function properly with SME enabled but the IOMMU
> won't necessarily be enabled.
>
> So in order for those devices to function, drop the "default y" for
> the SME by default on option so that users who want to have SME, will
> need to either enable it in their config or use "mem_encrypt=on" on the
> kernel command line.
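
For reference, the resulting behaviour could be sketched roughly as
below. This is an illustration in Python, not the kernel's actual
parsing code; the function name is hypothetical, and letting the last
mem_encrypt= occurrence win is a choice of this sketch.

```python
def sme_requested(active_by_default: bool, cmdline: str) -> bool:
    """Decide whether SME is requested at boot.

    active_by_default -- value of CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT
    cmdline           -- kernel command line as a single string
    """
    # Start from the Kconfig default, then let the command line override it.
    state = active_by_default
    for arg in cmdline.split():
        if arg == "mem_encrypt=on":
            state = True
        elif arg == "mem_encrypt=off":
            state = False
    return state


# With the patch applied and the option left unset, SME stays off unless
# the user asks for it explicitly.
print(sme_requested(False, "root=/dev/sda1 quiet"))  # False
print(sme_requested(False, "quiet mem_encrypt=on"))  # True
```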

Another option would be to enable SME by default on EPYC platforms,
but leave it disabled by default on client APU platforms, or even just
on Raven.

Other than these comments, looks fine to me.

Alex

>
> Fixes: 7744ccdbc16f ("x86/mm: Add Secure Memory Encryption (SME) support")
> Reported-by: Paul Menzel <pmenzel@xxxxxxxxxxxxx>
> Signed-off-by: Borislav Petkov <bp@xxxxxxx>
> Cc: <stable@xxxxxxxxxxxxxxx>
> Link: https://lkml.kernel.org/r/8bbacd0e-4580-3194-19d2-a0ecad7df09c@xxxxxxxxxxxxx
> ---
> arch/x86/Kconfig | 1 -
> 1 file changed, 1 deletion(-)
>
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 8055da49f1c0..6a336b1f3f28 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -1525,7 +1525,6 @@ config AMD_MEM_ENCRYPT
>
>  config AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT
>  	bool "Activate AMD Secure Memory Encryption (SME) by default"
> -	default y
>  	depends on AMD_MEM_ENCRYPT
>  	help
>  	  Say yes to have system memory encrypted by default if running on
> --
> 2.29.2
>
>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette