Re: [PATCH v3 1/2] x86/cpu/intel: Fix MTRR verification for TME enabled platforms

From: Compostella, Jeremy
Date: Fri Oct 13 2023 - 19:03:09 EST


"kirill.shutemov@xxxxxxxxxxxxxxx" <kirill.shutemov@xxxxxxxxxxxxxxx> writes:
> On Tue, Oct 03, 2023 at 02:06:52AM +0000, Huang, Kai wrote:
>> On Tue, 2023-10-03 at 01:47 +0300, kirill.shutemov@xxxxxxxxxxxxxxx wrote:
>> > On Fri, Sep 29, 2023 at 09:14:00AM +0000, Huang, Kai wrote:
>> > > On Thu, 2023-09-28 at 15:30 -0700, Compostella, Jeremy wrote:
>> > > > On TME enabled platforms, the BIOS publishes MTRRs taking into account
>> > > > the Total Memory Encryption (TME) reserved bits.
>> > > >
>> > > > generic_get_mtrr() performs a sanity check of the MTRRs relying on the
>> > > > `phys_hi_rsvd' variable which is set using the cpuinfo_x86 structure
>> > > > `x86_phys_bits' field. But at the time the generic_get_mtrr()
>> > > > function is run, `x86_phys_bits' has not yet been updated by
>> > > > detect_tme() when TME is enabled.
>> > > >
>> > > > Since x86_phys_bits does not yet reflect the real maximal physical
>> > > > address size, generic_get_mtrr() complains by logging the following
>> > > > messages.
>> > > >
>> > > > mtrr: your BIOS has configured an incorrect mask, fixing it.
>> > > > mtrr: your BIOS has configured an incorrect mask, fixing it.
>> > > > [...]
>> > > >
>> > > > In such a situation, generic_get_mtrr() returns an incorrect size but
>> > > > no side effects were observed during our testing.
>> > > >
>> > > > For `x86_phys_bits' to be updated before generic_get_mtrr() runs,
>> > > > move the detect_tme() call from init_intel() to early_init_intel().
>> > >
>> > > Hi,
>> > >
>> > > This move looks good to me, but +Kirill who is the author of detect_tme() for
>> > > further comments.
>> > >
>> > > Also I am not sure whether it's worth considering moving this to
>> > > get_cpu_address_sizes(), which calculates the virtual/physical address sizes.
>> > > It seems anything that can impact the physical address size could be put there.
>> >
>> > Actually, I am not sure how this patch works. AFAICS after the patch we
>> > have the following callchain:
>> >
>> > early_identify_cpu()
>> >   this_cpu->c_early_init() (which is early_init_intel())
>> >     detect_tme()
>> >       c->x86_phys_bits -= keyid_bits;
>> >   get_cpu_address_sizes(c);
>> >     c->x86_phys_bits = eax & 0xff;
>> >
>> > Looks like get_cpu_address_sizes() would override what detect_tme() does.
>>
>> After this patch, early_identify_cpu() calls get_cpu_address_sizes() first and
>> then calls c_early_init(), which calls detect_tme().
>>
>> So it looks like there is no override. No?

No override indeed, as get_cpu_address_sizes() is always called before
early_init_intel() or init_intel().

- init/main.c::start_kernel()
  - arch/x86/kernel/setup.c::setup_arch()
    - arch/x86/kernel/cpu/common.c::early_cpu_init()
      - early_identify_cpu()
        - get_cpu_address_sizes(c)
            c->x86_phys_bits = eax & 0xff;
        - arch/x86/kernel/cpu/intel.c::early_init_intel()
          - detect_tme()
              c->x86_phys_bits -= keyid_bits;
  - arch/x86/kernel/cpu/common.c::arch_cpu_finalize_init()
    - identify_boot_cpu()
      - identify_cpu()
        - get_cpu_address_sizes(c)
            c->x86_phys_bits = eax & 0xff;
        - arch/x86/kernel/cpu/intel.c::init_intel()
          - early_init_intel()
            - detect_tme()
                c->x86_phys_bits -= keyid_bits;
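
To make the ordering argument concrete, here is a minimal user-space sketch
of the two passes above. It is not kernel code: `struct cpu_model', the
model_*() helpers and the example values (46 physical address bits, 6 KeyID
bits) are assumptions made up for illustration only.

```c
/* Hypothetical stand-in for struct cpuinfo_x86; only the field discussed here. */
struct cpu_model {
	unsigned int x86_phys_bits;
};

/* Assumed example values, not read from real hardware. */
#define EXAMPLE_CPUID_PHYS_BITS 46u	/* modelled low byte of EAX */
#define EXAMPLE_KEYID_BITS       6u	/* modelled KeyID bits taken by TME */

/* Models get_cpu_address_sizes(): overwrites the field from the CPUID value. */
static void model_get_cpu_address_sizes(struct cpu_model *c)
{
	unsigned int eax = EXAMPLE_CPUID_PHYS_BITS;

	c->x86_phys_bits = eax & 0xff;
}

/* Models detect_tme(): subtracts the KeyID bits from the current value. */
static void model_detect_tme(struct cpu_model *c)
{
	c->x86_phys_bits -= EXAMPLE_KEYID_BITS;
}

/*
 * Models one identification pass in the order shown in the call chain:
 * the address sizes are (re)read first, then the TME adjustment is applied.
 */
static unsigned int model_identify_pass(struct cpu_model *c)
{
	model_get_cpu_address_sizes(c);
	model_detect_tme(c);
	return c->x86_phys_bits;
}
```

Calling model_identify_pass() twice on the same structure, as the early and
the late identification paths would, gives the same reduced value both times
because each pass starts by resetting the field before subtracting.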

> We identify CPU twice: once via early_cpu_init() and the second time via
> identify_boot_cpu()/identify_secondary_cpu(). I am talking about
> early_cpu_init() codepath.
>
> It might not matter in practice as of now, because it will get straightened
> out later, but the CPU ident code is a mess as it is. Let's not make it even worse.

This change does not modify the CPU ident code; it only re-orders the
detect_tme() call in the Intel-specific hook so that the information
is available earlier, as it is needed by generic_get_mtrr(). This is
similar to what is done in arch/x86/kernel/cpu/amd.c.
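
For illustration, here is a simplified user-space model of why a stale
`x86_phys_bits' trips the mask sanity check. This is a sketch under stated
assumptions, not the generic_get_mtrr() implementation: the model_*() helpers
are hypothetical, and the check is reduced to "the BIOS mask, combined with
the bits the kernel considers reserved, must be a contiguous run of ones up
to bit 63".

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: bits [phys_bits-1:12] set (assumes phys_bits < 64). */
static uint64_t model_full_mask(unsigned int phys_bits)
{
	return ((1ULL << phys_bits) - 1) & ~0xfffULL;
}

/*
 * Hypothetical model of the mask a BIOS would program for a power-of-two
 * sized range when only usable_bits address bits are usable (TME case).
 */
static uint64_t model_bios_mask(unsigned int usable_bits, uint64_t size)
{
	return model_full_mask(usable_bits) & ~(size - 1);
}

/*
 * Simplified sanity check: treat every bit at or above x86_phys_bits as
 * reserved, OR those bits into the BIOS mask and require the result to be
 * contiguous ones from its lowest set bit up to bit 63.
 */
static bool model_mask_looks_valid(uint64_t bios_mask, unsigned int x86_phys_bits)
{
	uint64_t rsvd = ~((1ULL << x86_phys_bits) - 1);
	uint64_t full = bios_mask | rsvd;
	uint64_t lowest = full & -full;	/* isolate the lowest set bit */

	if (!bios_mask)
		return false;

	return full == ~(lowest - 1);
}
```

With an assumed 1 GiB range programmed for 40 usable bits, the model rejects
the mask when x86_phys_bits is still the unadjusted 46 (bits 45:40 form a
gap) and accepts it once x86_phys_bits has been reduced to 40.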