Re: [PATCH] arm64: cpufeature: check translation granule size based on kernel config

From: Leo Yan
Date: Thu May 18 2017 - 09:03:10 EST


On Thu, May 18, 2017 at 01:41:15PM +0100, Mark Rutland wrote:
> On Thu, May 18, 2017 at 01:38:41PM +0100, Will Deacon wrote:
> > On Thu, May 18, 2017 at 07:36:27PM +0800, Leo Yan wrote:
> > > On Thu, May 18, 2017 at 11:39:01AM +0100, Suzuki K Poulose wrote:
> > > > On 18/05/17 11:21, Leo Yan wrote:
> > > > > >On a big.LITTLE system with two clusters, one is a CA53 cluster and
> > > > > >the other is a CA73 cluster. CA53 doesn't support the 16KB memory
> > > > > >translation granule size (4.3.21 AArch64 Memory Model Feature Register
> > > > > >0, EL1; ARM DDI 0500F), but CA73 supports this feature (4.3.27 AArch64
> > > > > >Memory Model Feature Register 0, EL1; ARM 100048_0002_04_en). As a
> > > > > >result, the kernel logs an "Unexpected variation" message, as below.
> > > > >
> > > > >[ 0.182113] CPU features: SANITY CHECK: Unexpected variation in SYS_ID_AA64MMFR0_EL1. Boot CPU: 0x00000000001122, CPU4: 0x00000000101122
> > > >
> > > > >
> > > > > >This patch makes the CPU feature check for the memory translation
> > > > > >granule size depend on the kernel configuration. If the kernel
> > > > > >configuration selects one specific memory translation granule size,
> > > > > >then we do strict sanity checking for it across all CPUs. Otherwise
> > > > > >we skip checking the granule sizes that the kernel doesn't use.
> > > > >
> > > >
> > > > If we were to suppress the warning (more on that below), we could simply
> > > > make this feature a NON_STRICT, since the unsupported CPUs won't boot
> > > > with 16K to hit this sanity check.
> > > >
> > > > However, there is a problem with disabling this warning. If a VM starts
> > > > using 16KB page size on a 4K/64K host, the VM could end up in unknown
> > > > failures when it switches to an unsupported CPU (after it has booted).
> > > > Of course the real fix lies in making KVM expose the safe value
> > > > for granule support to the VCPUs (which is currently being worked on by
> > > > Douglas in Cc). So, when we have that ready, we could make it NON_STRICT
> > > > instead of this approach.
> > >
> > > Thanks for the info :)
> > >
> > > I will use the patch below in our production branch temporarily. You
> > > could work out a formal patch for upstreaming once the dependency
> > > patches are ready:
> >
> > The other thing we could do is change the way we taint on mismatch so that
> > we don't dump the scary (and pointless) backtrace.
>
> The backtrace is due to the WARN part of the WARN_TAINT_ONCE.
>
> We could change that as below, if we really want to get rid of the backtrace.
>
> Thanks,
> Mark.
>
> ---->8----
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index 94b8f7f..1f53314 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -639,8 +639,8 @@ void update_cpu_features(int cpu,
> * Mismatched CPU features are a recipe for disaster. Don't even
> * pretend to support them.
> */
> - WARN_TAINT_ONCE(taint, TAINT_CPU_OUT_OF_SPEC,
> - "Unsupported CPU feature variation.\n");
> + pr_warn_once("Unsupported CPU feature variation detected.\n");
> + add_taint(TAINT_CPU_OUT_OF_SPEC);

add_taint() takes a second lockdep_ok argument, so should this be
add_taint(TAINT_CPU_OUT_OF_SPEC, LOCKDEP_STILL_OK)?

> }
>
> u64 read_sanitised_ftr_reg(u32 id)
>