Re: [PATCH 2/2] x86/mtrr: Refactor PAT initialization code

From: Luis R. Rodriguez
Date: Tue Mar 29 2016 - 19:44:08 EST

On Tue, Mar 29, 2016 at 5:16 PM, Toshi Kani <toshi.kani@xxxxxxx> wrote:
> On Tue, 2016-03-29 at 15:12 -0700, Luis R. Rodriguez wrote:
>> On Tue, Mar 29, 2016 at 2:46 PM, Toshi Kani <toshi.kani@xxxxxxx> wrote:
>> > On Tue, 2016-03-29 at 10:14 -0700, Luis R. Rodriguez wrote:
>> > > On Fri, Mar 18, 2016 at 2:35 PM, Toshi Kani <toshi.kani@xxxxxxx>
>> > > wrote:
> :
>> > >
>> > > Do we really need UC for the fan?
>> >
>> > When you say "we", are you referring Xen guests? Xen guests do not
>> > need to control the fan, so they do not need UC set in MTRRs.
>> >
>> > In general, yes, MMIO registers need UC when they need to be accessed.
>> Curious, what does a BIOS do for fan control when MTRRs are disabled?
> You mean, when the kernel modified the MTRR setup and disabled them.

Nope, but the below is good to know!

I meant to ask about the case where the option the lets a user go in a
muck with BIOS settings to disable MTRR e xists and the user disables
MTRR. What would happen for fan control in such situations? I'd
imagine such cases allow for a system to exist with proper fan
control, and allow the kernel to boot without having to deal with the
pesky MTRRs at all, while PAT lives on, no?

> would assume the original setup and still access the registers. This may
> lead to undefined behavior and may result in a system crash.
>> Also what if a BIOS just set MSR_MTRRdefType to uncachable only ?
> Many BIOSes actually set the default type to UC.

Thanks, I asked as I saw my BIOS uses write-back by default. Good to
know there are different strategies.

> MTRRs then cover regular memory with WB.

When you say regular memory you mean everything else we see as RAM? I
was under the impression we'd only need MTRR for a special range of
memory, and its up to implementation how they are used. If you can use
MTRR to change the cache attribute for regular RAM and if this is
actually a requirement if the default MTRR is UC then one way or
another a BIOS seems to always require MTRR, either for UC setting for
fan control or WB for regular RAM, is that right?

>> Wouldn't that help simplify the BIOS when systems are known as not
>> wanting to deal with reading MTRRs on the kernel front, even if its
>> just to read the setup ?
> Nope.
>> I'm trying to determine exactly why a BIOS cannot simply enable use an
>> alternative for what it needs for fan control and let the kernel live
>> without any MTRR code at run time as an option. Although the
>> documentation says that the same "procedure" is needed for PAT setup,
>> I see it possible to split the skeleton of the code and have each
>> peace of code live separately and compartmentalized, they'd just have
>> respective calls on the skeleton of the procedure.
> I agree that the MTRR rendezvous handler can be improved for PAT, but I do
> not see a compelling reason to make such change now. With my fix, I think
> the code works reasonably for Xen.

Agreed, don't think its needed now, my questions are for future optimizations.

>> > > What is the default for PAT?
>> >
>> > There is no such thing as the default for PAT.
>> >
>> > > Can't
>> > > the same be used so that we way by default all ranges match what is
>> > > also the default by PAT? Would that really break fan control ? If we
>> > > have a match should't we be able to not have to worry about MTRRs at
>> > > all in-kernel even on bare metal?
>> >
>> > We do not need to know about BIOS impl, such as fan control, etc. The
>> > point is that if BIOS sets MTRRs, then the kernel keeps their setup.
>> Right, if the kernel no longer uses it directly it seems like an
>> aweful lot of code to keep updating simply for a BIOS requirement, I'm
>> trying to see if we can have the option to live without this
>> requirement.
> Please be aware of the hibernation case. I think this procedure involves
> setting MTRRs back to the original setup.

Eek, right, so best just disable them if we can.

>> > If (virtual) BIOS does not enable MTRRs, the kernel keeps them
>> > disabled. We just need not to mess with the setup.
>> Sure, thanks! I'm trying to see if we can have a similar option on bare
>> metal.
>> > > Another option, which I've alluded to on the Xen thread is skipping
>> > > over the MTRR space from the e820 map. Is that not possible ? This
>> > > could be last resort... but which I'm hinting more for the Xen side
>> > > of things if we *really* need get_mtrr() on the Xen guest side of
>> > > things...
>> >
>> > There is no MTRR space in the e820 map since they are MSRs. Since Xen
>> > guests disable MTRRs, I do not think you have any issue here...
>> Xen seems to clip the e820 map given to a guest in certain MTRR
>> conditions, see init_e820(), this calls
>> machine_specific_memory_setup() which later clips MTRR if
>> mtrr_top_of_ram(). This is an Intel check that trims the e820 map if
>> MTRRs were found to be enabled and the default MTRR is not write-back.
>> If returns the address of the first non write-back variable MTRR, it
>> uses clip_to_limit() to limit the exposed memory [0], notice how
>> clip_to_limit() is also used to generally limit exposed memory through
>> the opt_mem boot parameter as well. Its not exactly clear why that's
>> done, but this looks very similar to the Linux MTRR cleanup -- see
>> x86_get_mtrr_mem_range().
>> [0];a=blob;f=xen/arch/x86/e820.c
> It looks to me that the code makes sure all E820_RAM ranges in the e820
> table are covered by WB entries of MTRRs. If not, it trims the e820 table.
> I suppose it tries to react on a case when someone modified MTRRs and
> resulted in mismatch with the e820 table. I'd think you do not need this
> code as long as you do not modify the MTRR setup.

Great thanks for that -- another optimization possible.