Re: [PATCH v4 2/6] x86: document WC MTRR effects on PAT / non-PAT pages

From: Borislav Petkov
Date: Mon May 04 2015 - 08:23:34 EST


On Wed, Apr 29, 2015 at 02:44:07PM -0700, Luis R. Rodriguez wrote:
> From: "Luis R. Rodriguez" <mcgrof@xxxxxxxx>
>
> As part of the effort to phase out MTRR use document
> write-combining MTRR effects on pages with different
> non-PAT page attributes flags and different PAT entry
> values. Extend arch_phys_wc_add() documentation to
> clarify power of two sizes / boundary requirements as
> we phase out mtrr_add() use.
>
> Lastly hint towards ioremap_uc() for corner cases on
> device drivers working with devices with mixed regions
> where MTRR size requirements would otherwise not
> enable write-combining effective memory types.
>
> Cc: Toshi Kani <toshi.kani@xxxxxx>
> Cc: Jonathan Corbet <corbet@xxxxxxx>
> Cc: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
> Cc: Andy Lutomirski <luto@xxxxxxxxxxxxxx>
> Cc: Suresh Siddha <sbsiddha@xxxxxxxxx>
> Cc: Ingo Molnar <mingo@xxxxxxx>
> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Cc: Juergen Gross <jgross@xxxxxxxx>
> Cc: Daniel Vetter <daniel.vetter@xxxxxxxx>
> Cc: Dave Airlie <airlied@xxxxxxxxxx>
> Cc: Antonino Daplas <adaplas@xxxxxxxxx>
> Cc: Jean-Christophe Plagniol-Villard <plagnioj@xxxxxxxxxxxx>
> Cc: Tomi Valkeinen <tomi.valkeinen@xxxxxx>
> Cc: Ville Syrjälä <syrjala@xxxxxx>
> Cc: Mel Gorman <mgorman@xxxxxxx>
> Cc: Vlastimil Babka <vbabka@xxxxxxx>
> Cc: Borislav Petkov <bp@xxxxxxx>
> Cc: Davidlohr Bueso <dbueso@xxxxxxx>
> Cc: linux-fbdev@xxxxxxxxxxxxxxx
> Cc: linux-kernel@xxxxxxxxxxxxxxx
> Signed-off-by: Luis R. Rodriguez <mcgrof@xxxxxxxx>
> ---
> Documentation/x86/mtrr.txt | 18 +++++++++++++++---
> Documentation/x86/pat.txt | 40 +++++++++++++++++++++++++++++++++++++++-
> arch/x86/kernel/cpu/mtrr/main.c | 3 +++
> 3 files changed, 57 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/x86/mtrr.txt b/Documentation/x86/mtrr.txt
> index cc071dc..a111a6c 100644
> --- a/Documentation/x86/mtrr.txt
> +++ b/Documentation/x86/mtrr.txt
> @@ -1,7 +1,19 @@
> MTRR (Memory Type Range Register) control
> -3 Jun 1999
> -Richard Gooch
> -<rgooch@xxxxxxxxxxxxx>
> +
> +Richard Gooch <rgooch@xxxxxxxxxxxxx> - 3 Jun 1999
> +Luis R. Rodriguez <mcgrof@xxxxxxxxxxxxxxxx> - April 9, 2015
> +
> +===============================================================================
> +Phasing MTRR use

"Phasing out...".

> +
> +MTRR use is replaced on modern x86 hardware with PAT. Over time the only type
> +of effective MTRR that is expected to be supported will be for write-combining.
> +As MTRR use is phased out device drivers should use arch_phys_wc_add() to make
> +MTRR effective on non-PAT systems while a no-op on PAT enabled systems.
> +
> +For details refer to Documentation/x86/pat.txt.
> +
> +===============================================================================
>
> On Intel P6 family processors (Pentium Pro, Pentium II and later)
> the Memory Type Range Registers (MTRRs) may be used to control
> diff --git a/Documentation/x86/pat.txt b/Documentation/x86/pat.txt
> index cf08c9f..7e183e3 100644
> --- a/Documentation/x86/pat.txt
> +++ b/Documentation/x86/pat.txt
> @@ -34,6 +34,8 @@ ioremap | -- | UC- | UC- |
> | | | |
> ioremap_cache | -- | WB | WB |
> | | | |
> +ioremap_uc | -- | UC | UC |
> + | | | |
> ioremap_nocache | -- | UC- | UC- |
> | | | |
> ioremap_wc | -- | -- | WC |
> @@ -102,7 +104,43 @@ wants to export a RAM region, it has to do set_memory_uc() or set_memory_wc()
> as step 0 above and also track the usage of those pages and use set_memory_wb()
> before the page is freed to free pool.
>
> -
> +MTRR effects on PAT / non-PAT systems
> +-------------------------------------
> +
> +The following table provides the effects of using write-combining MTRRs when
> +using ioremap*() calls on x86 for both non-PAT and PAT systems. Ideally
> +mtrr_add() usage will be phased in favor of arch_phys_wc_add() which will

s/phased/phased out/

> +be a no-op on PAT enabled systems. The region over which a arch_phys_wc_add()
> +is made should already have be ioremap'd with write-combining page attributes

, have been ioremapped with WC attributes...

> +or PAT entries, this can be done by using ioremap_wc() / or respective helpers.
> +Devices which combine areas of IO memory desired to remain uncachable with
> +areas where write-combining is desirable and are restricted by the size
> +requirements of MTRRs should consider splitting up their IO memory space
> +cleanly with ioremap_uc() and ioremap_wc() followed by an arch_phys_wc_add()
> +encompassing both regions. Such use is nevertheless heavily discouraged as
> +the effective memory type is considered implementation defined. This strategy
> +should only be used as last resort on devices with size-constrained regions
> +where otherwise MTRR write-combining would not be effective.
> +
> +Note that you cannot use set_memory_wc() to override / whitelist IO remapped
> +memory space mapped with ioremap*() calls, set_memory_wc() can only be used
> +on RAM.
> +
> +----------------------------------------------------------------------
> +MTRR Non-PAT   PAT    Linux ioremap value        Effective memory type
> +----------------------------------------------------------------------
> +                                                  Non-PAT |  PAT
> +     PAT
> +     |PCD
> +     ||PWT
> +     |||
> +WC   000      WB      _PAGE_CACHE_MODE_WB            WC   |   WC
> +WC   001      WC      _PAGE_CACHE_MODE_WC            WC*  |   WC
> +WC   010      UC-     _PAGE_CACHE_MODE_UC_MINUS      WC*  |   WC
> +WC   011      UC      _PAGE_CACHE_MODE_UC            UC   |   UC
> +----------------------------------------------------------------------
> +
> +(*) denotes implementation defined and is discouraged
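
FWIW, the table above boils down to a very small lookup. Below is a
userspace sketch of how I read it, not kernel code: the enum and both
function names are made up for illustration, and the only facts encoded
are the four rows quoted above.

```c
#include <stdbool.h>

/* Hypothetical names mirroring the four rows of the quoted table. */
enum cache_mode { MODE_WB, MODE_WC, MODE_UC_MINUS, MODE_UC };

/*
 * Effective memory type when a WC MTRR covers the range. The table
 * lists the same result in the non-PAT and PAT columns, so a single
 * lookup serves both.
 */
static const char *effective_type_under_wc_mtrr(enum cache_mode mode)
{
	return mode == MODE_UC ? "UC" : "WC";
}

/* Rows marked (*): the result is implementation defined on non-PAT. */
static bool non_pat_result_is_impl_defined(enum cache_mode mode)
{
	return mode == MODE_WC || mode == MODE_UC_MINUS;
}
```

Read this way, only a strong UC mapping keeps the range uncached under
a WC MTRR, which is presumably why the text hints at ioremap_uc() for
the mixed-region case.
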
>
> Notes:
>
> diff --git a/arch/x86/kernel/cpu/mtrr/main.c b/arch/x86/kernel/cpu/mtrr/main.c
> index ea5f363..12abdbe 100644
> --- a/arch/x86/kernel/cpu/mtrr/main.c
> +++ b/arch/x86/kernel/cpu/mtrr/main.c
> @@ -538,6 +538,9 @@ EXPORT_SYMBOL(mtrr_del);
> * attempts to add a WC MTRR covering size bytes starting at base and
> * logs an error if this fails.
> *
> + * The caller should expect to need to provide a power of two size on an
> + * equivalent power of two boundary.
> + *
> * Drivers must store the return value to pass to mtrr_del_wc_if_needed,
> * but drivers should not try to interpret that return value.
> */
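
An example might make the "power of two size on an equivalent power of
two boundary" wording easier to follow. A minimal userspace sketch,
assuming it means the size must be a nonzero power of two and the base
must be aligned to that size; wc_range_ok() is a hypothetical helper
and not an existing kernel function:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not a kernel API: returns true when size is a
 * nonzero power of two and base is aligned to that same size.
 */
static bool wc_range_ok(uint64_t base, uint64_t size)
{
	/* A power of two has exactly one bit set. */
	if (size == 0 || (size & (size - 1)) != 0)
		return false;

	/* The base must be a multiple of the size. */
	return (base & (size - 1)) == 0;
}
```

Under that assumption a 16 MiB range at 0xe0000000 passes, while a
3 MiB range, or a 16 MiB range starting at 0xe0800000, does not.
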
> --
> 2.3.2.209.gd67f9d5.dirty
>

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.