Re: [PATCH] EDAC, amd64: Add Family 17h Model 11h support.

From: Michael Jin
Date: Tue Aug 14 2018 - 20:21:45 EST


On Tue, Aug 14, 2018 at 4:26 PM, Ghannam, Yazen <Yazen.Ghannam@xxxxxxx> wrote:
>
> > -----Original Message-----
> > From: Michael Jin <mikhail.jin@xxxxxxxxx>
> > Sent: Friday, August 10, 2018 2:36 PM
> > To: Borislav Petkov <bp@xxxxxxx>; Ghannam, Yazen
> > <Yazen.Ghannam@xxxxxxx>; Mauro Carvalho Chehab
> > <mchehab@xxxxxxxxxx>
> > Cc: linux-edac@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; Michael Jin
> > <mikhail.jin@xxxxxxxxx>
> There may be some differences between models, but things should generally
> work. It's just a matter of whether or not the Platform enables certain things
> like DRAM ECC, etc.
>
> Does the amd64_edac_mod module load on your platform with just this patch?
>
> > + fam_type = &family_types[F17_M11H_CPUS];
> > + pvt->ops = &family_types[F17_M11H_CPUS].ops;
> > + break;
> > + }
> > fam_type = &family_types[F17_CPUS];
> > pvt->ops = &family_types[F17_CPUS].ops;
> > break;

Yes, I was able to load amd64_edac_mod on my AMD Ryzen Embedded V1807B
(further details on my blog -
https://ndimcomputing.io/fs-fp5v_ecc_linux.html).

> > diff --git a/drivers/edac/amd64_edac.h b/drivers/edac/amd64_edac.h
> > index 1d4b74e9a037..e50226cd53c6 100644
> > --- a/drivers/edac/amd64_edac.h
> > +++ b/drivers/edac/amd64_edac.h
> > @@ -115,6 +115,8 @@
> > #define PCI_DEVICE_ID_AMD_16H_M30H_NB_F2 0x1582
> > #define PCI_DEVICE_ID_AMD_17H_DF_F0 0x1460
> > #define PCI_DEVICE_ID_AMD_17H_DF_F6 0x1466
> > +#define PCI_DEVICE_ID_AMD_17H_M11H_DF_F0 0x15e8
> > +#define PCI_DEVICE_ID_AMD_17H_M11H_DF_F6 0x15ee
> >
>
> These IDs are used for Fam17h Models 10h-2Fh. Can you please change
> the names here and in the rest of this patch?
>
> The format is to use the first supported model in the name, e.g. M11H -> M10H.
>
> > /*
> > * Function 1 - Address Map
> > @@ -281,6 +283,7 @@ enum amd_families {
> > F16_CPUS,
> > F16_M30H_CPUS,
> > F17_CPUS,
> > + F17_M11H_CPUS,
> > NUM_FAMILIES,
> > };
> >

According to https://en.wikichip.org/wiki/amd/cpuid, no family 17h
model 10h part is publicly known.

Could you therefore confirm that model 10h uses 0x15e8 (device F0) and
0x15ee (device F6)? I cannot find any documentation for that model, and
I cannot test whether ECC works on it.

The BIOS on Raven Ridge motherboards does not enable ECC, but those
processors share the same CPUID (0081_0F10h, see
http://www.cpu-world.com/CPUs/Zen/AMD-Ryzen%205%202400G.html) with the
AMD Ryzen Embedded V1000 Processor Family.
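
For reference, here is a minimal sketch of how the 0081_0F10h signature
breaks down into family and model, assuming the usual AMD convention
that the extended family and extended model fields apply when the base
family is 0xF. The struct and function names are made up for this
illustration and are not taken from the kernel.

```c
#include <stdint.h>

/* Decoded view of the CPUID Fn0000_0001 EAX signature (hypothetical type). */
struct cpu_sig {
	unsigned int family;
	unsigned int model;
	unsigned int stepping;
};

/* Hypothetical helper: split a raw EAX signature into its fields. */
static struct cpu_sig decode_cpuid_eax(uint32_t eax)
{
	struct cpu_sig sig;
	unsigned int base_family = (eax >> 8) & 0xf;   /* bits 11:8  */
	unsigned int base_model  = (eax >> 4) & 0xf;   /* bits 7:4   */
	unsigned int ext_model   = (eax >> 16) & 0xf;  /* bits 19:16 */
	unsigned int ext_family  = (eax >> 20) & 0xff; /* bits 27:20 */

	sig.stepping = eax & 0xf;                      /* bits 3:0   */
	sig.family = base_family;
	sig.model = base_model;

	/* Assumption: extended fields are only added for base family 0xF. */
	if (base_family == 0xf) {
		sig.family += ext_family;
		sig.model |= ext_model << 4;
	}
	return sig;
}
```

With eax = 0x00810F10 this gives family 0x17, model 0x11, stepping 0,
which is why the patch checks for model 11h.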

I wrote this patch because amd64_edac_mod uses the wrong device IDs for
the AMD Ryzen Embedded V1000 Family (device F0 and F6 fail to load, as
they do not share 0x1460 and 0x1466 with the other family 17h
processors).
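
To make the intended selection concrete, here is a rough standalone
sketch, not the driver code itself. It assumes the 10h-2Fh model range
given in the review above and the four device IDs from the patch. The
macro, struct, and function names are hypothetical.

```c
#include <stdint.h>

/* Device IDs as they appear in the patch; macro names are hypothetical. */
#define DF_F0_17H       0x1460
#define DF_F6_17H       0x1466
#define DF_F0_17H_M10H  0x15e8
#define DF_F6_17H_M10H  0x15ee

struct df_ids {
	uint16_t f0;
	uint16_t f6;
};

/*
 * Hypothetical helper: pick the Data Fabric F0/F6 device IDs for a
 * family 17h part. Returns 0 on success, -1 for any other family.
 */
static int pick_df_ids(unsigned int family, unsigned int model,
		       struct df_ids *out)
{
	if (family != 0x17)
		return -1;

	/* Assumption from the review: models 10h-2Fh share the new IDs. */
	if (model >= 0x10 && model <= 0x2f) {
		out->f0 = DF_F0_17H_M10H;
		out->f6 = DF_F6_17H_M10H;
	} else {
		out->f0 = DF_F0_17H;
		out->f6 = DF_F6_17H;
	}
	return 0;
}
```

For family 0x17, model 0x11 (the V1807B) this returns 0x15e8 and 0x15ee.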

Michael