Re: [PATCH] sb_edac: Fix detection on SNB machines

From: Andy Lutomirski
Date: Sun Feb 08 2015 - 11:54:35 EST


On Feb 5, 2015 4:37 AM, "Borislav Petkov" <bp@xxxxxxxxx> wrote:
>
> On Thu, Feb 05, 2015 at 12:50:35PM +0100, Borislav Petkov wrote:
> > From: Borislav Petkov <bp@xxxxxxx>
> >
> > Commit 50d1bb93672f ("sb_edac: add support for Haswell based systems")
> > broke the driver on my SNB box with PCI ID 0x3ca0:
> >
> > 3f:0e.0 System peripheral: Intel Corporation Xeon E5/Core i7 Processor Home Agent (rev 07)
> > 00: 86 80 a0 3c 00 00 00 00 07 00 80 08 00 00 80 00
> > ...
> >
> > because its probe routine gets handed in pdev->device: 0x3ca0 but we're
> > matching for 0x3ca8, i.e. PCI_DEVICE_ID_INTEL_SBRIDGE_IMC_TA and the
> > probing fails.
>
> Or was it Andy who broke it?!
>
> From looking at
>
> d0585cd815fa ("sb_edac: Claim a different PCI device")
>
> static const struct pci_device_id sbridge_pci_tbl[] = {
> - {PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_SBRIDGE_IMC_TA)},
> + {PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_SBRIDGE_IMC_HA0)},
>
> Yeah, it looks like it:
>
> We're working with the IMC_TA device for matching but we get handed in
> the IMC_HA0 device after this patch which confirms with my observations.
>
> tztztz, people, can we decide on one PCI device and stick with it?

Can you send your CPU model and the full output of lspci -nn.

If there isn't an obvious fix, I'd be okay with reverting, too. The
i2c stuff is delayed because the outcome of my investigation is that
it has real problems, and it'll probably have to wait until later this
year, or a scary module param, to deal with it.

--Andy

>
> :-)
>
> [ Leaving in the rest for Andy. ]
>
> > Adding 0x3ca0 fixes the issue and driver loads successfully again:
> >
> > [ 2449.013120] EDAC DEBUG: sbridge_init:
> > [ 2449.017029] EDAC sbridge: Seeking for: PCI ID 8086:3ca0
> > [ 2449.022368] EDAC DEBUG: sbridge_get_onedevice: Detected 8086:3ca0
> > [ 2449.028498] EDAC sbridge: Seeking for: PCI ID 8086:3ca0
> > [ 2449.033768] EDAC sbridge: Seeking for: PCI ID 8086:3ca8
> > [ 2449.039028] EDAC DEBUG: sbridge_get_onedevice: Detected 8086:3ca8
> > [ 2449.045155] EDAC sbridge: Seeking for: PCI ID 8086:3ca8
> > ...
> >
> > Add a debug printk while at it to be able to catch the failure in the
> > future and dump driver version on successful load.
> >
> > Cc: Tony Luck <tony.luck@xxxxxxxxx>
> > Cc: Aristeu Rozanski <aris@xxxxxxxxxx>
> > Cc: Mauro Carvalho Chehab <m.chehab@xxxxxxxxxxx>
> > Signed-off-by: Borislav Petkov <bp@xxxxxxx>
> > ---
> > drivers/edac/sb_edac.c | 8 ++++++--
> > 1 file changed, 6 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/edac/sb_edac.c b/drivers/edac/sb_edac.c
> > index 63aa6730e89e..2e1c03a64751 100644
> > --- a/drivers/edac/sb_edac.c
> > +++ b/drivers/edac/sb_edac.c
> > @@ -2448,6 +2448,7 @@ static int sbridge_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> > type = IVY_BRIDGE;
> > break;
> > case PCI_DEVICE_ID_INTEL_SBRIDGE_IMC_TA:
> > + case PCI_DEVICE_ID_INTEL_SBRIDGE_IMC_HA0:
> > rc = sbridge_get_all_devices(&num_mc, pci_dev_descr_sbridge_table);
> > type = SANDY_BRIDGE;
> > break;
> > @@ -2460,8 +2461,11 @@ static int sbridge_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> > type = BROADWELL;
> > break;
> > }
> > - if (unlikely(rc < 0))
> > + if (unlikely(rc < 0)) {
> > + edac_dbg(0, "couldn't get all devices for 0x%x\n", pdev->device);
> > goto fail0;
> > + }
> > +
> > mc = 0;
> >
> > list_for_each_entry(sbridge_dev, &sbridge_edac_list, list) {
> > @@ -2474,7 +2478,7 @@ static int sbridge_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> > goto fail1;
> > }
> >
> > - sbridge_printk(KERN_INFO, "Driver loaded.\n");
> > + sbridge_printk(KERN_INFO, "%s\n", SBRIDGE_REVISION);
> >
> > mutex_unlock(&sbridge_edac_lock);
> > return 0;
> > --
> > 2.2.0.33.gc18b867
> >
> >
>
> --
> Regards/Gruss,
> Boris.
>
> ECO tip #101: Trim your mails when you reply.
> --
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/