Re: [BUG] Bisected Problem with LSI PCI FC Adapter

From: Andreas Noever
Date: Mon Sep 22 2014 - 10:53:32 EST


On Mon, Sep 22, 2014 at 4:25 PM, Bjorn Helgaas <bhelgaas@xxxxxxxxxx> wrote:
> On Sat, Sep 20, 2014 at 12:41 PM, Dirk Gouders <dirk@xxxxxxxxxxx> wrote:
>> Bjorn Helgaas <bhelgaas@xxxxxxxxxx> writes:
>>
>>> On Sat, Sep 13, 2014 at 09:41:34PM +0200, Dirk Gouders wrote:
>>>> So, I did some tests on the VX50 which probably wasn't the worst idea,
>>>> because it behaves differently from the test machine.
>>>>
>>>> Summary:
>>>>
>>>> 1) Bjorn's back pocket patch works on the VX50.
>>>>
>>>> On the test machine it causes a trace that has to do with
>>>> mount_root. I tried to use netconsole, but it complained that the
>>>> interface was not ready.
>>>
>>> OK, that's good. I put this revert patch in for-linus for v3.17. I regard
>>> this as a temporary fix, not the real solution. My guess is the test
>>> machine doesn't boot because you're missing a driver, so not related to the
>>> revert patch.
>>
>> I assumed my limit-host-bridge-aperture-and-ignore-bridges-patch on top
>> of your patch caused this, so I took a closer look.
>>
>> Your patch works fine with current rc5+ on the test machine -- with and
>> without my additional patch.
>
> Great, thanks for testing that!
>
>> Various other test results from today (VX50) will be appended to
>> bugzilla in a few moments.
>
> The Windows Server 2008 boot shows that Windows reconfigures the
> 00:0e.0 bridge so it fits inside the [bus 00-07] aperture reported by
> the host bridge _CRS, and the LSI FC adapter is not enumerated at all.
> That's basically the same behavior that prompted your bug report.
> This suggests that Windows does *not* reset the secondary bus when
> changing the bridge configuration.
>
> For v3.17, I reverted 1820ffdccb9b ("PCI: Make sure bus number
> resources stay within their parents bounds"). For the future, I think
> we should do a quirk to fix the _CRS, similar to what Andreas has
> posted, and apply 1820ffdccb9b again.
>
> But I think the quirk should be specific to this system and BIOS. I
> don't want to add a workaround that silently covers up Linux and BIOS
> bugs. The reason amd_bus.c exists is because Linux was not smart
> enough to pay attention to _CRS. Linux is now pretty good at that, so
> the reason for amd_bus.c is mostly gone. I don't want to add new
> dependencies on amd_bus.c that will prevent us from phasing it out.

Why not always trust amd_bus over _CRS? Is there a scenario in which
amd_bus is wrong?

Are these methods (like _CRS) meant to set limits for us, or are they
simply used to report the configuration decisions made by the BIOS? So
if _CRS says that the window is [00-07] would it be ok for us to
simply increase it (possibly after reprogramming the registers in
amd_bus)?
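
To make the question concrete, here is a minimal sketch in plain C (not
kernel code) of the two policies being discussed: treating the _CRS bus
window as a hard limit that a bridge's bus range must fit inside, versus
widening the window to cover what the bridge was actually programmed
with. The struct and function names are hypothetical and exist only for
illustration; the only value taken from this thread is the [bus 00-07]
aperture.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Inclusive range of PCI bus numbers, e.g. [bus 00-07]. */
struct bus_range {
	uint8_t start;
	uint8_t end;
};

/*
 * "Limit" policy: a bridge's [secondary, subordinate] range is only
 * acceptable if it lies entirely inside the host bridge window.
 */
static bool bus_range_contains(struct bus_range parent,
			       struct bus_range child)
{
	return child.start <= child.end &&
	       parent.start <= child.start &&
	       child.end <= parent.end;
}

/*
 * "Report" policy: treat the window as a description of what the BIOS
 * set up and grow it so that it also covers the child range. Real
 * hardware would additionally need its routing registers reprogrammed;
 * that part is not modelled here.
 */
static struct bus_range bus_range_widen(struct bus_range parent,
					struct bus_range child)
{
	struct bus_range widened = parent;

	assert(parent.start <= parent.end);
	assert(child.start <= child.end);

	if (child.start < widened.start)
		widened.start = child.start;
	if (child.end > widened.end)
		widened.end = child.end;
	return widened;
}
```

With a window of { 0x00, 0x07 } and a purely hypothetical bridge range
of { 0x0a, 0x0a }, bus_range_contains() rejects the bridge, while
bus_range_widen() yields { 0x00, 0x0a }, which then contains it.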