Re: Supermicro X9SRL-F - channel enumeration error & ACPI/firmware bug question

From: Bjorn Helgaas
Date: Mon Nov 26 2012 - 20:11:47 EST


On Mon, Nov 26, 2012 at 6:00 PM, Justin Piszcz <jpiszcz@xxxxxxxxxxxxxxx> wrote:
>
>
> -----Original Message-----
> From: Bjorn Helgaas [mailto:bhelgaas@xxxxxxxxxx]
> Sent: Monday, November 26, 2012 8:00 PM
> To: Bruno Prémont
> Cc: Justin Piszcz; support@xxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; Dan
> Williams
> Subject: Re: Supermicro X9SRL-F - channel enumeration error & ACPI/firmware
> bug question
>
> [Try Dan's current email address; sorry Dan]
>
> On Mon, Nov 26, 2012 at 5:56 PM, Bjorn Helgaas <bhelgaas@xxxxxxxxxx> wrote:
>> [+cc Dan]
>>
>> On Mon, Nov 26, 2012 at 2:42 PM, Bruno Prémont
>> <bonbons@xxxxxxxxxxxxxxxxx> wrote:
>>> Hi Justin,
>>>
>>> On Sat, 24 November 2012 "Justin Piszcz" wrote:
>>>> Is the following normal on an X9SRL-F board (bios 1.0a)?
>>>>
>>>> In the manual it states:
>>>>
>>>> Data Direct I/O
>>>> Select Enabled to enable Intel I/OAT (I/O Acceleration Technology), which
>>>> significantly reduces CPU overhead by leveraging CPU architectural
>>>> improvements and freeing the system resource for other tasks. The options
>>>> are Disabled and Enabled.
>>>>
>>>> Default is Enabled.
>>>>
>>>> When enabled in the kernel, I see the following:
>>>>
>>>> [ 0.696357] ioatdma: Intel(R) QuickData Technology Driver 4.00
>>>> [ 0.696487] ioatdma 0000:00:04.0: channel error register unreachable
>>>> [ 0.696546] ioatdma 0000:00:04.0: channel enumeration error
>>>> [ 0.696604] ioatdma 0000:00:04.0: Intel(R) I/OAT DMA Engine init failed
>>>> [ 0.696721] ioatdma 0000:00:04.1: channel error register unreachable
>>>> [ 0.696779] ioatdma 0000:00:04.1: channel enumeration error
>>>> [ 0.697522] ioatdma 0000:00:04.1: Intel(R) I/OAT DMA Engine init failed
>>>> [ 0.697617] ioatdma 0000:00:04.2: channel error register unreachable
>>>> [ 0.697681] ioatdma 0000:00:04.2: channel enumeration error
>>>> [ 0.697739] ioatdma 0000:00:04.2: Intel(R) I/OAT DMA Engine init failed
>>>> [ 0.697831] ioatdma 0000:00:04.3: channel error register unreachable
>>>> [ 0.697890] ioatdma 0000:00:04.3: channel enumeration error
>>>> [ 0.697948] ioatdma 0000:00:04.3: Intel(R) I/OAT DMA Engine init failed
>>>> [ 0.698037] ioatdma 0000:00:04.4: channel error register unreachable
>>>> [ 0.698095] ioatdma 0000:00:04.4: channel enumeration error
>>>> [ 0.698153] ioatdma 0000:00:04.4: Intel(R) I/OAT DMA Engine init failed
>>>> [ 0.698245] ioatdma 0000:00:04.5: channel error register unreachable
>>>> [ 0.698303] ioatdma 0000:00:04.5: channel enumeration error
>>>> [ 0.698360] ioatdma 0000:00:04.5: Intel(R) I/OAT DMA Engine init failed
>>>> [ 0.698449] ioatdma 0000:00:04.6: channel error register unreachable
>>>> [ 0.698508] ioatdma 0000:00:04.6: channel enumeration error
>>>> [ 0.698565] ioatdma 0000:00:04.6: Intel(R) I/OAT DMA Engine init failed
>>>> [ 0.698676] ioatdma 0000:00:04.7: channel error register unreachable
>>>> [ 0.698735] ioatdma 0000:00:04.7: channel enumeration error
>>>> [ 0.698792] ioatdma 0000:00:04.7: Intel(R) I/OAT DMA Engine init failed
>>>>
>>>> --
>>>>
>>>> Also, I tried using ASPM (enabled in BIOS), but since ACPI Linux query is
>>>> ignored, it fails to work:
>>>> [ 0.562229] [Firmware Bug]: ACPI: BIOS _OSI(Linux) query ignored
>>>>
>>>> I assume this is something Supermicro has to fix?
>>>
>>> You are probably missing some kernel config option(s) :) - I did fight similar
>>> issues on a Fujitsu SandyBridge Xeon based server.
>>>
>>> Check if enabling CONFIG_X86_X2APIC helps as well as other APIC/IOMMU options.
>>
>> Changing config options is not a valid fix for error messages like
>> this. We should be able to make the config smarter by adding
>> dependencies or something, or else make the driver smart enough to
>> give a more useful diagnostic.
>>
>> The "channel error register unreachable" message indicates that
>> pci_read_config_dword() failed. The register in question
>> (IOAT_PCI_CHANERR_INT_OFFSET) is at 0x180, so possibly we don't have
>> PCI config accessors for the extended config space (0x100-0xfff). A
>> complete dmesg log should show that.
>
> --
>
> Here is the full dmesg: (I went back to my older kernel, let me know if you
> need a dmesg w/ those options enabled)
> http://home.comcast.net/~jpiszcz/20121126/dmesg.txt

It looks like maybe you don't have CONFIG_PCI_MMCONFIG turned on?
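
If it helps, here is a rough userspace sketch (Python) of the two checks
implied above: whether a kernel config sets CONFIG_PCI_MMCONFIG, and whether
the register at 0x180 is reachable through a device's sysfs config file. The
helper names are made up for illustration. The sketch assumes that the sysfs
"config" file is only 256 bytes long when extended config space is not
accessible, and that reading past the standard header needs root; treat both
as assumptions to verify on your system.

```python
import os
import re

EXT_CFG_START = 0x100        # first byte of PCIe extended config space
CHANERR_INT_OFFSET = 0x180   # register offset quoted earlier in this thread


def option_enabled(config_text, option):
    """Return True if kernel .config text sets `option` to y or m."""
    pattern = r"^%s=(y|m)$" % re.escape(option)
    return re.search(pattern, config_text, re.MULTILINE) is not None


def extended_config_visible(config_path):
    """Return True if the config file exposes bytes past offset 0xff.

    Assumption: the file size reflects how much config space the kernel
    can reach (256 bytes without extended access, 4096 bytes with it).
    """
    return os.path.getsize(config_path) > EXT_CFG_START


def read_config_dword(config_path, offset):
    """Read a little-endian 32-bit value at `offset`, or None if the
    file ends before four bytes could be read."""
    with open(config_path, "rb") as f:
        f.seek(offset)
        data = f.read(4)
    if len(data) != 4:
        return None
    return int.from_bytes(data, "little")
```

For the device in the log, one would pass the text of the running kernel's
config file to option_enabled() with "CONFIG_PCI_MMCONFIG", and pass
/sys/bus/pci/devices/0000:00:04.0/config to the other two helpers with
CHANERR_INT_OFFSET. A None result from read_config_dword() would be
consistent with the "channel error register unreachable" message.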