Re: HP DL585 warm boot fail (old)
From: Bjorn Helgaas
Date: Wed Oct 24 2018 - 09:49:51 EST
On Wed, Oct 24, 2018 at 10:47:24AM +0300, Meelis Roos wrote:
> > Would you mind opening a report at https://bugzilla.kernel.org? I'm
> > not sure if anybody will be able to do anything about this, but it's
> > always possible.
>
> Submitted now, https://bugzilla.kernel.org/show_bug.cgi?id=201503
>
> > A complete dmesg log and "sudo lspci -vv" output from a successful
> > boot would be a good start. And if you have a screenshot of the
> > failure, that would help, too. You can use the "ignore_loglevel"
> > kernel parameter to make sure we see everything on the console.
>
> Added.
>
> > Does this machine have an iLO? If so, it may have logs that
> > could be useful if this is related to some sort of bus error.
>
> Nothing in the ILO logs.

Great, thanks!

Can you try the patch below?  This is extracted from the code here:

  https://github.com/joyent/illumos-joyent/blob/b6a0b04d591f5b877cfe05f45e81f0e8a5cfc2b3/usr/src/uts/intel/io/pci/pci_boot.c#L1805

I'm not sure why this would be only an intermittent problem, but at
least we can see if this is related.

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 6bc27b7fd452..842f900ed194 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -5113,3 +5113,15 @@ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_MICROSEMI, 0x8575,
 			      quirk_switchtec_ntb_dma_alias);
 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_MICROSEMI, 0x8576,
 			      quirk_switchtec_ntb_dma_alias);
+
+static void quirk_amd_8111(struct pci_dev *pdev)
+{
+	u8 ioc;
+
+	pci_read_config_byte(pdev, 0x40, &ioc);
+	if (ioc & 0x80) {
+		pci_info(pdev, "disabling NMI on error\n");
+		pci_write_config_byte(pdev, 0x40, ioc & ~0x80);
+	}
+}
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_AMD, 0x7468, quirk_amd_8111);
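
If it is easier to look at the register from userspace before rebuilding
a kernel, here is a minimal sketch of the same bit test the quirk does.
It is only an illustration, not part of the patch.  The macro and
function names are made up for this example (the patch uses the bare
values 0x40 and 0x80), and it assumes the device's config space is
exposed through the sysfs "config" file; I believe reading past the
first 64 bytes of that file needs root.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical names; the patch itself uses the bare values 0x40 and 0x80. */
#define IOC_REG_OFFSET   0x40
#define IOC_NMI_ON_ERROR 0x80

/* Nonzero if the "NMI on error" bit (bit 7) is set in the register value. */
static int nmi_on_error_enabled(uint8_t ioc)
{
	return (ioc & IOC_NMI_ON_ERROR) != 0;
}

/* The value the quirk would write back: the same byte with bit 7 cleared. */
static uint8_t clear_nmi_on_error(uint8_t ioc)
{
	return (uint8_t)(ioc & ~IOC_NMI_ON_ERROR);
}

/*
 * Read one byte of config space through a sysfs "config" file, e.g.
 * /sys/bus/pci/devices/<domain:bus:dev.fn>/config (the address of the
 * 1022:7468 device has to come from lspci; it is not assumed here).
 * Returns 0 on success, -1 on any failure.
 */
static int read_config_byte(const char *path, long offset, uint8_t *val)
{
	FILE *f = fopen(path, "rb");
	int ok;

	if (!f)
		return -1;
	ok = fseek(f, offset, SEEK_SET) == 0 && fread(val, 1, 1, f) == 1;
	fclose(f);
	return ok ? 0 : -1;
}
```

Calling read_config_byte() with IOC_REG_OFFSET on the device's config
file and passing the result to nmi_on_error_enabled() shows whether the
quirk would fire on a given boot; clear_nmi_on_error() gives the value
it would write.  Comparing that across a good and a bad warm boot might
say something about why the failure is intermittent.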