Re: [PATCH v4 05/15] arm64: Mark all I/O as non-secure shared
From: Steven Price
Date: Wed Jul 10 2024 - 11:35:17 EST
On 09/07/2024 13:54, Suzuki K Poulose wrote:
> Hi Will
>
> On 09/07/2024 12:39, Will Deacon wrote:
>> On Mon, Jul 01, 2024 at 10:54:55AM +0100, Steven Price wrote:
>>> All I/O is by default considered non-secure for realms. As such
>>> mark them as shared with the host.
>>>
>>> Co-developed-by: Suzuki K Poulose <suzuki.poulose@xxxxxxx>
>>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@xxxxxxx>
>>> Signed-off-by: Steven Price <steven.price@xxxxxxx>
>>> ---
>>> Changes since v3:
>>> * Add PROT_NS_SHARED to FIXMAP_PAGE_IO rather than overriding
>>> set_fixmap_io() with a custom function.
>>> * Modify ioremap_cache() to specify PROT_NS_SHARED too.
>>> ---
>>> arch/arm64/include/asm/fixmap.h | 2 +-
>>> arch/arm64/include/asm/io.h | 8 ++++----
>>> 2 files changed, 5 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/arch/arm64/include/asm/fixmap.h
>>> b/arch/arm64/include/asm/fixmap.h
>>> index 87e307804b99..f2c5e653562e 100644
>>> --- a/arch/arm64/include/asm/fixmap.h
>>> +++ b/arch/arm64/include/asm/fixmap.h
>>> @@ -98,7 +98,7 @@ enum fixed_addresses {
>>> #define FIXADDR_TOT_SIZE (__end_of_fixed_addresses << PAGE_SHIFT)
>>> #define FIXADDR_TOT_START (FIXADDR_TOP - FIXADDR_TOT_SIZE)
>>> -#define FIXMAP_PAGE_IO __pgprot(PROT_DEVICE_nGnRE)
>>> +#define FIXMAP_PAGE_IO __pgprot(PROT_DEVICE_nGnRE | PROT_NS_SHARED)
>>> void __init early_fixmap_init(void);
>>> diff --git a/arch/arm64/include/asm/io.h b/arch/arm64/include/asm/io.h
>>> index 4ff0ae3f6d66..07fc1801c6ad 100644
>>> --- a/arch/arm64/include/asm/io.h
>>> +++ b/arch/arm64/include/asm/io.h
>>> @@ -277,12 +277,12 @@ static inline void __const_iowrite64_copy(void
>>> __iomem *to, const void *from,
>>> #define ioremap_prot ioremap_prot
>>> -#define _PAGE_IOREMAP PROT_DEVICE_nGnRE
>>> +#define _PAGE_IOREMAP (PROT_DEVICE_nGnRE | PROT_NS_SHARED)
>>> #define ioremap_wc(addr, size) \
>>> - ioremap_prot((addr), (size), PROT_NORMAL_NC)
>>> + ioremap_prot((addr), (size), (PROT_NORMAL_NC | PROT_NS_SHARED))
>>> #define ioremap_np(addr, size) \
>>> - ioremap_prot((addr), (size), PROT_DEVICE_nGnRnE)
>>> + ioremap_prot((addr), (size), (PROT_DEVICE_nGnRnE | PROT_NS_SHARED))
>>
>> Hmm. I do wonder whether you've pushed the PROT_NS_SHARED too far here.
>>
>> There's nothing _architecturally_ special about the top address bit.
>> Even if the RSI divides the IPA space in half, the CPU doesn't give two
>> hoots about it in the hardware. In which case, it feels wrong to bake
>> PROT_NS_SHARED into ioremap_prot -- it feels much better to me if the
>> ioremap() code OR'd that into the physical address when passing it down

This is really just a simplification given we don't (yet) have device
assignment.

> Actually we would like to push the decision of applying the
> "pgprot_decrypted" vs pgprot_encrypted into ioremap_prot(), rather
> than sprinkling every user of ioremap_prot().
>
> This could be made depending on the address that is passed on to the
> ioremap_prot(). I guess we would need explicit requests from the callers
> to add "encrypted vs decrypted". Is that what you guys are looking at ?

There's a missing piece at the moment in terms of how the guest is going
to identify whether a particular device is protected or shared (i.e. a
real assigned device, or emulated by the VMM). Once that's added, I
was expecting ioremap_prot() to provide that flag based on discovering
whether the address range passed in is for an assigned device or not.
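
To make that concrete, here is a rough user-space sketch of the decision,
not kernel code. Every sketch_* name, the range table and the bit values
are assumptions made up for illustration; in particular the real
PROT_NS_SHARED value is not a fixed constant like this, and the discovery
mechanism doesn't exist yet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder values for illustration only; not the kernel's definitions. */
#define SKETCH_PROT_DEVICE     0x1ULL
#define SKETCH_PROT_NS_SHARED  (1ULL << 55)

struct sketch_range {
	uint64_t base;
	uint64_t size;
};

/* Hypothetical table of ranges that belong to assigned (protected) devices. */
static const struct sketch_range sketch_assigned[] = {
	{ 0x40000000ULL, 0x10000ULL },
};

/* True only if [addr, addr + size) lies entirely inside an assigned range. */
static bool sketch_range_is_assigned(uint64_t addr, uint64_t size)
{
	assert(size > 0);

	for (size_t i = 0; i < sizeof(sketch_assigned) / sizeof(sketch_assigned[0]); i++) {
		const struct sketch_range *r = &sketch_assigned[i];

		/* Written to avoid overflow in addr + size. */
		if (addr >= r->base && size <= r->size &&
		    addr - r->base <= r->size - size)
			return true;
	}
	return false;
}

/*
 * Stand-in for the decision ioremap_prot() would make: anything that is not
 * a known assigned device is emulated by the VMM and so is mapped shared.
 * A range that only partially overlaps an assigned device is also treated
 * as shared here; a real implementation would probably want to reject it.
 */
static uint64_t sketch_ioremap_prot_bits(uint64_t addr, uint64_t size, uint64_t prot)
{
	if (!sketch_range_is_assigned(addr, size))
		prot |= SKETCH_PROT_NS_SHARED;
	return prot;
}
```

With this shape the callers keep passing plain attributes such as
PROT_DEVICE_nGnRE and never need to know about the shared flag themselves.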
>>
>> There's a selfish side of that argument, in that we need to hook
>> ioremap() for pKVM protected guests, but I do genuinely feel that
>> treating address bits as protection bits is arbitrary and doesn't belong
>> in these low-level definitions. In a similar vein, AMD has its

I'd be interested to see how pKVM will handle both protected and
emulated (by the VMM) devices. Although we have the 'top bit' flag, it's
actually a pain to pass that down to the guest as a flag to use for this
purpose (e.g. 32-bit PCI BARs are too small to hold it). So our current
thought is an out-of-band request to identify whether a particular address
corresponds to a protected device or not. We'd then set the top bit
appropriately.
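
As a rough illustration of that flow (a user-space sketch, not kernel
code): the 48-bit IPA width, the sketch_* names and the hard-coded
"protected" window are all assumptions, and sketch_query_protected() is
only a stand-in for whatever the out-of-band request ends up being.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed IPA width for illustration; the real width is discovered at runtime. */
#define SKETCH_IPA_BITS 48u

/* The 'top bit' of the IPA space, which marks an address as shared. */
static uint64_t sketch_top_bit(void)
{
	return 1ULL << (SKETCH_IPA_BITS - 1);
}

/*
 * Stand-in for the out-of-band request. Here one hard-coded window is
 * reported as belonging to a protected device.
 */
static bool sketch_query_protected(uint64_t ipa)
{
	return ipa >= 0x80000000ULL && ipa < 0x80100000ULL;
}

/*
 * Take the address as the guest discovered it (e.g. from a BAR), ask
 * whether it is protected, and set or clear the top bit to match.
 */
static uint64_t sketch_effective_ipa(uint64_t ipa)
{
	uint64_t top = sketch_top_bit();

	ipa &= ~top;
	return sketch_query_protected(ipa) ? ipa : (ipa | top);
}
```

Note that (uint32_t)sketch_top_bit() is 0 under these assumptions, which is
the 32-bit BAR problem: the register simply has no room to carry the flag.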
>> sme_{set,clr}() macros that operate on the PA (e.g. via dma_to_phys()),
>> which feels like a more accurate abstraction to me.
>
> I believe that doesn't solve all the problems. They do have a hook in
> __ioremap_caller() that implicitly applies pgprot_{en,de}crypted
> depending on other info.

This is the other option - which pushes the knowledge down to the
individual drivers to decide whether a region is 'encrypted' (i.e.
protected) or not. It's more flexible, but potentially requires 'fixing'
many drivers to understand this.

Thanks,
Steve

> Cheers
> Suzuki
>
>>
>> Will
>