Re: [PATCH 2/2] x86/mtrr: Refactor PAT initialization code

From: Luis R. Rodriguez
Date: Tue Mar 29 2016 - 18:13:17 EST


On Tue, Mar 29, 2016 at 2:46 PM, Toshi Kani <toshi.kani@xxxxxxx> wrote:
> On Tue, 2016-03-29 at 10:14 -0700, Luis R. Rodriguez wrote:
>> On Fri, Mar 18, 2016 at 2:35 PM, Toshi Kani <toshi.kani@xxxxxxx> wrote:
>> > On Thu, 2016-03-17 at 17:06 -0700, Luis R. Rodriguez wrote:
>> > > On Mar 17, 2016 2:04 PM, "Toshi Kani" <toshi.kani@xxxxxxx> wrote:
>> > > >
> :
>> >
>> > I do not see any issue for Xen, but sure, we can discuss about Xen in a
>> > separate thread.
>>
>> While on point -- I'll just wanted to clarify that a while ago you had
>> hinted we needed to have Xen return a valid type with
>> mtrr_type_lookup(), I then also explained how Xen disables MTRR [0],
>> it was however unclear if you still believe mtrr_type_lookup() is
>> needed on the guest side. Jan had pointed out that the Xen Hypervisor
>> implements the XENPF_read_memtype hypercall. On the recent thread I
>> posted [1] I got into my review of the prospects of implementing
>> support for using this hypercall on Linux xen guests and issues and
>> concerns with it. Please feel free to follow up there and we can take
>> up the other items below here as they relate to bare metal.
>
> What I said in the email [0] was that "When MTRRs are enabled, the kernel
> needs to check through mtrr_type_lookup()".
>
> Since Xen guests have MTRRs disabled, this statement does not apply. It
> returns MTRR_TYPE_INVALID when disabled, and this is fine for the guests.

Ah, I see thanks for the clarification!

>> [0] http://lkml.kernel.org/r/20150903235429.GZ8051@xxxxxxxxxxxxx
>> [1] http://lkml.kernel.org/r/CAB=NE6UTp0T=rbOcAg88iPsfBJneY7O5-3c11VfFgAP
>> iepoTNg@xxxxxxxxxxxxxx
>>
>> > > > > On x86 Linux code we now have ioremap_uc() that can't use MTRR
>> > > > > behind the scenes, why would something like this on the BIOS not
>> > > > > be possible? That ultimately uses set_pte_at(). What limitations
>> > > > > are there on the BIOS that prevent us from just using strong UC
>> > > > > for PAT on the BIOS?
>> > > >
>> > > > Because it requires to run in virtual mode with page tables.
>> > >
>> > > Ah... interesting... is UC really needed, what is the default? If the
>> > > default is used would there be an issue ? Can such work be deferred
>> > > to a later time ? It seems like a high burden to require on large
>> > > piece of legacy architecture to just blow a fan.
>> >
>> > The default cache attribute (i.e. ranges not covered by MTRRs) is
>> > specified by the MTRR default type MSR.
>>
>> Do we really need UC for the fan?
>
> When you say "we", are you referring Xen guests? Xen guests do not need to
> control the fan, so they do not need UC set in MTRRs.
>
> In general, yes, MMIO registers need UC when they need to be accessed.

Curious, what does a BIOS do for fan control when MTRRs are disabled?

Also what if a BIOS just set MSR_MTRRdefType to uncachable only ?
Wouldn't that help simplify the BIOS when systems are known as not
wanting to deal with reading MTRRs on the kernel front, even if its
just to read the setup ?

I'm trying to determine exactly why a BIOS cannot simply enable use an
alternative for what it needs for fan control and let the kernel live
without any MTRR code at run time as an option. Although the
documentation says that the same "procedure" is needed for PAT setup,
I see it possible to split the skeleton of the code and have each
peace of code live separately and compartmentalized, they'd just have
respective calls on the skeleton of the procedure.

>> What is the default for PAT?
>
> There is no such thing as the default for PAT.
>
>> Can't
>> the same be used so that we way by default all ranges match what is
>> also the default by PAT? Would that really break fan control ? If we
>> have a match should't we be able to not have to worry about MTRRs at
>> all in-kernel even on bare metal?
>
> We do not need to know about BIOS impl, such as fan control, etc. The
> point is that if BIOS sets MTRRs, then the kernel keeps their setup.

Right, if the kernel no longer uses it directly it seems like an
aweful lot of code to keep updating simply for a BIOS requirement, I'm
trying to see if we can have the option to live without this
requirement.

> If (virtual) BIOS does not enable MTRRs, the kernel keeps them disabled. We
> just need not to mess with the setup.

Sure, thanks! I'm trying to see if we can have a similar option on bare metal.

>> Another option, which I've alluded to on the Xen thread is skipping
>> over the MTRR space from the e820 map. Is that not possible ? This
>> could be last resort... but which I'm hinting more for the Xen side of
>> things if we *really* need get_mtrr() on the Xen guest side of
>> things...
>
> There is no MTRR space in the e820 map since they are MSRs. Since Xen
> guests disable MTRRs, I do not think you have any issue here...

Xen seems to clip the e820 map given to a guest in certain MTRR
conditions, see init_e820(), this calls
machine_specific_memory_setup() which later clips MTRR if
mtrr_top_of_ram(). This is an Intel check that trims the e820 map if
MTRRs were found to be enabled and the default MTRR is not write-back.
If returns the address of the first non write-back variable MTRR, it
uses clip_to_limit() to limit the exposed memory [0], notice how
clip_to_limit() is also used to generally limit exposed memory through
the opt_mem boot parameter as well. Its not exactly clear why that's
done, but this looks very similar to the Linux MTRR cleanup -- see
x86_get_mtrr_mem_range().

[0] http://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=xen/arch/x86/e820.c

Luis