Re: [PATCH v8 1/2] x86/mm: add .bss..decrypted section to hold shared variables
From: Brijesh Singh
Date: Fri Sep 14 2018 - 07:59:34 EST
On 9/14/18 2:10 AM, Borislav Petkov wrote:
> On Thu, Sep 13, 2018 at 04:51:10PM -0500, Brijesh Singh wrote:
>> kvmclock defines few static variables which are shared with the
>> hypervisor during the kvmclock initialization.
> ...
>
>> diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
>> index 8047379..c16af27 100644
>> --- a/arch/x86/kernel/head64.c
>> +++ b/arch/x86/kernel/head64.c
>> @@ -112,6 +112,7 @@ static bool __head check_la57_support(unsigned long physaddr)
>> unsigned long __head __startup_64(unsigned long physaddr,
>> struct boot_params *bp)
>> {
>> + unsigned long vaddr, vaddr_end;
>> unsigned long load_delta, *p;
>> unsigned long pgtable_flags;
>> pgdval_t *pgd;
>> @@ -235,6 +236,21 @@ unsigned long __head __startup_64(unsigned long physaddr,
>> sme_encrypt_kernel(bp);
>>
>> /*
>> + * Clear the memory encryption mask from the .bss..decrypted section.
>> + * The bss section will be memset to zero later in the initialization so
>> + * there is no need to zero it after changing the memory encryption
>> + * attribute.
>> + */
>> + if (mem_encrypt_active()) {
>> + vaddr = (unsigned long)__start_bss_decrypted;
>> + vaddr_end = (unsigned long)__end_bss_decrypted;
>> + for (; vaddr < vaddr_end; vaddr += PMD_SIZE) {
>> + i = pmd_index(vaddr);
>> + pmd[i] -= sme_get_me_mask();
>> + }
>> + }
> Why isn't this chunk at the end of sme_encrypt_kernel() instead of here?
The sme_encrypt_kernel() does not have access to pmd (after pointer
fixup is applied). You can extend the sme_encrypt_kernel() to pass an
additional arguments but then we start getting in include hell. The pmd
is defined as "pmdval_t". If we extend the sme_encrypt_kernel() thenÂ
asm/mem_encrypt.h need to include the header file which defines
"pmdval_t". Adding the 'asm/pgtable_type.h' was causing all kind of
compilation errors. I didn't spend much time on it. IMO, we really don't
need to go in this path unless we see some value from doing this.
>> + /*
>> * Return the SME encryption mask (if SME is active) to be used as a
>> * modifier for the initial pgdir entry programmed into CR3.
>> */
>> diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
>> index 9c77d2d..0d618ee 100644
>> --- a/arch/x86/kernel/vmlinux.lds.S
>> +++ b/arch/x86/kernel/vmlinux.lds.S
>> @@ -65,6 +65,23 @@ jiffies_64 = jiffies;
>> #define ALIGN_ENTRY_TEXT_BEGIN . = ALIGN(PMD_SIZE);
>> #define ALIGN_ENTRY_TEXT_END . = ALIGN(PMD_SIZE);
>>
>> +/*
>> + * This section contains data which will be mapped as decrypted. Memory
>> + * encryption operates on a page basis. Make this section PMD-aligned
>> + * to avoid splitting the pages while mapping the section early.
>> + *
>> + * Note: We use a separate section so that only this section gets
>> + * decrypted to avoid exposing more than we wish.
>> + */
>> +#define BSS_DECRYPTED \
>> + . = ALIGN(PMD_SIZE); \
>> + __start_bss_decrypted = .; \
>> + *(.bss..decrypted); \
>> + . = ALIGN(PAGE_SIZE); \
>> + __start_bss_decrypted_unused = .; \
>> + . = ALIGN(PMD_SIZE); \
>> + __end_bss_decrypted = .; \
>> +
>> #else
>>
>> #define X86_ALIGN_RODATA_BEGIN
>> @@ -74,6 +91,7 @@ jiffies_64 = jiffies;
>>
>> #define ALIGN_ENTRY_TEXT_BEGIN
>> #define ALIGN_ENTRY_TEXT_END
>> +#define BSS_DECRYPTED
>>
>> #endif
>>
>> @@ -345,6 +363,7 @@ SECTIONS
>> __bss_start = .;
>> *(.bss..page_aligned)
>> *(.bss)
>> + BSS_DECRYPTED
>> . = ALIGN(PAGE_SIZE);
>> __bss_stop = .;
> Putting it in the BSS would need a bit of care in the future as it poses
> a certain ordering on the calls in x86_64_start_kernel() (not that there
> isn't any now :)):
Hmm, I thought since we are explicitly saying that attribute will add
the variables in the bss section so caller should not be using the
variable before bss is ready. If you are concerns that clear_bss() may
get called after the variable is used then we could memset(0) when while
clearing the C-bit. I did try to add some comments when we are clearing
the C-bit. I can include some verbatim in BSS_DECRYPTED section as well.
> You have:
>
> x86_64_start_kernel:
>
> ...
>
> clear_bss();
>
> ...
>
> x86_64_start_reservations() -> ... -> setup_arch() -> kvmclock_init()
>
> so it would be prudent to put at least a comment somewhere, say, over
> the definition of BSS_DECRYPTED or so, that attention should be paid to
> the clear_bss() call early.