Re: Fwd: [PATCH] x86: Use larger chunks in mtrr_cleanup

From: Luis R. Rodriguez
Date: Wed Mar 16 2016 - 16:21:04 EST

On Thu, Nov 05, 2015 at 11:43:59AM -0800, Luis R. Rodriguez wrote:
> On Thu, Nov 5, 2015 at 11:14 AM, Yinghai Lu <yinghai@xxxxxxxxxx> wrote:
> > On Mon, Sep 14, 2015 at 7:46 AM, Stuart Hayes <stuart.w.hayes@xxxxxxxxx> wrote:
> >>
> >> Booting with 'disable_mtrr_cleanup' works, but the system I am working with
> >> isn't actually failing--it just gets ugly error messages. And the BIOS on the
> >> system I am working with had set up the MTRRs correctly.
> >
> > Please post boot log and /proc/mtrr for:
> > 1. without your patch
> > 2. without your patch and with disable_mtrr_cleanup in boot command line.
> > 3. with your patch.
> Stuart,
> to provide some context -- I reached out to Yinghai as he wrote the
> original mtrr cleanup code. The commit logs seem to read that a crash
> was possible on systems with > 4 GiB RAM with some types of BIOSes...
> The cleanup code seems to trigger when variable MTRRs do not exist
> that are UC, or when all varible MTRRs that exist are just UC + WB
> (Yinghai correct me if I'm wrong). The commit log in question
> (95ffa2438d0e9 "x86: mtrr cleanup for converting continuous to
> discrete layout, v8") was not very clear about the cause of the crash
> -- but suppose the issue here was the BIOS on some systems might want
> to create some UC variable MTRRs early on and there was no UC MTRRs
> available, and I can only guess the cleanup exists as hack for those
> BIOSes. Even if that was the case -- its still not clear *why* the
> crash would happen but I suppose a driver mishap can happen without UC
> guarantees for some devices the BIOS may want to enable UC MTRR on.
> To be able to determine what we do upstream we need to understand the
> above first. We also need to understand if the cleanup might also be
> implicated by userspace drivers using /proc/mtrr, or if a proprietary
> driver exists that does use mtrr_add() directly even though PAT has
> been available for ages and all drivers are now properly converted.
> With clear answers to the above we'll be able to determine what the
> right course of action should be for this patch. For instance I'm
> inclined to strive to disable the complex cleanup code if we don't
> need it anymore, but if we do need it your patch makes sense. If the
> patch makes sense then though are we going to have to keep updating
> the segment size *every time* as systems grow? That seems rather
> silly. And if PAT is prevalent why are vendors adding MTRRs still? The
> cleanup seems complex and a major hack for a fix for some BIOSes, I'd
> much rather identify the exact issue and only have a fix to address
> that case.

I never heard back... so let's take this up on the other thread I just