RE: [RFC PATCH] x86/cpufeatures: Enumerate new AVX512 bfloat16 instructions
From: Yu, Fenghua
Date: Tue Jun 11 2019 - 23:34:34 EST
> On Tuesday, June 11, 2019 3:28 PM, Fenghua Yu wrote:
> On Tue, Jun 11, 2019 at 09:47:02PM +0200, Borislav Petkov wrote:
> > On Tue, Jun 11, 2019 at 11:19:20AM -0700, Fenghua Yu wrote:
> > > So can I re-organize word 11 and 12 as follows?
> > >
> > > 1. Change word 11 to host scattered features.
> > > 2. Move the previos features in word 11 and word 12 to word 11:
> > > /*
> > > * Extended auxiliary flags: Linux defined - For features scattered
> > > in various
> > > * CPUID levels and sub-leaves like CPUID level 7 and sub-leaf 1, etc,
> word 19.
> > > */
> > > #define X86_FEATURE_CQM_LLC (11*32+ 0) /* LLC QoS if 1 */
> > > #define X86_FEATURE_CQM_OCCUP_LLC (11*32+ 1) /* LLC
> occupancy monitoring */
> > > #define X86_FEATURE_CQM_MBM_TOTAL (11*32+ 2) /* LLC Total
> MBM monitoring */
> > > #define X86_FEATURE_CQM_MBM_LOCAL (11*32+ 3) /* LLC Local
> MBM monitoring */
> >
> > Yap.
> >
> > > 3. Change word 12 to host CPUID.(EAX=7,ECX=1):EAX:
> > > /* Intel-defined CPU features, CPUID level 0x7:1 (EAX), word 12 */
> > > #define X86_FEATURE_AVX512_BF16 (12*32+ 0) /* BFLOAT16
> instructions */
> >
> > This needs to be (12*32+ 5) if word 12 is going to map leaf
> > CPUID.(EAX=7,ECX=1):EAX.
> >
> > At least judging from the arch extensions doc which lists EAX as:
> >
> > Bits 04-00: Reserved.
> > Bit 05: AVX512_BF16. Vector Neural Network Instructions supporting
> BFLOAT16 inputs and conversion instructions from IEEE single precision.
> > Bits 31-06: Reserved.
>
> Yes, you are absolutely right. I'll defint it as (12*32+ 5).
>
> >
> > > 4. Do other necessary changes to match the new word 11 and word 12.
> >
> > But split in two patches: first does steps 1+2, second patch adds the
> > new leaf to word 12.
>
> There are two varialbes defined in cpuinfo_x86: x86_cache_max_rmid and
> x86_cache_occ_scale. c->x86_cache_max_rmid is read from CPUID.0xf.1:ECX
> and c->x86_cache_occ_scale is read from CPUID.0xf.1:EBX.
>
> After getting X86_FEATURE_CQM_* from scattered, the two variables need
> to be read from CPUID again. So the code of reading the two variables need
> to be moved from before init_scattered_cpuid_features(c) to after the
> function. This make the get_cpu_cap() code awkward.
>
> And the two variables are ONLY used in resctrl monitoring configuration.
> There is no need to store them in cpuinfo_x86 on each CPU.
>
> I'm thinking to simplify and clean this part of code:
>
> 1. In patch #1:
> - remove the definitions of x86_cache_max_rmid and x86_cache_occ_scale
> from cpuinfo_x86
> - remove assignment of c->x86_cache_max_rmid and c-
> >x86_cache_occ_scale from get_cpu_cap(c)
> - get r->mon_scale and r->num_rmid in rdt_get_mon_l3_config(r) directly
> from CPUID.0xf.1:EBX and CPUID.0xf.1:ECX.
> 2. In patch #2: do steps 1+2 to recycle word 11. After patch #1, I can totally
> remove the code to get c->x86_cache_max_rmd and
> c->x86_cache_occ_scale in get_cpu_cap(c). And patch #2 is cleaner.
> 3. In patch #3: add new word 12 to host CPUID.7.1:EAX
>
> Do you think the patch #1 is necessary and this is a right patch set?
My bad. I studied a bit more and found the patch #1 is not needed. Please ignore my last email.
Thanks.
-Fenghua