Re: [PATCH] PCI: add a quirk for keeping Bus Master bit on shutdown

From: Bjorn Helgaas
Date: Mon Nov 25 2013 - 23:20:50 EST


On Mon, Nov 25, 2013 at 9:11 PM, Chang Liu <cl91tp@xxxxxxxxx> wrote:
> If this turns out to be a problem specific to Acer V573G, we can simply
> wrap the if (pdev->...) line in ata/ahci.c with if
> (dmi_check_system(acer_v573g))
> so that only acer v573g laptops will be affected by this patch. The remaining
> part of the patch won't need to be changed.

Well, sure. But that doesn't change the fact that until we know *why*
it hangs, this is voodoo programming. And I don't like voodoo
programming.

> 2013/11/26 Bjorn Helgaas <bhelgaas@xxxxxxxxxx>:
>> [+cc Lan, Khalid, Konstantin, Alan, Takao, Jility, Florian, linux-kernel]
>>
>> On Tue, Nov 12, 2013 at 07:40:03PM +0000, Chang Liu wrote:
>>> This fixes https://bugzilla.kernel.org/show_bug.cgi?id=63861
>>>
>>> Commit b566a22c2 and 7897e60227 made pci_device_shutdown()
>>> unconditionally clear Bus Master bit for every pci devices.
>>> While this works for most hardware, certain devices are not
>>> compatible with this. Intel Lynx Point-LP SATA Controller,
>>> for example, will hang the system if its Bus Master bit is
>>> cleared during device shutdown. This patch adds a pci quirk
>>> so that device drivers can instruct pci_device_shutdown()
>>> to keep Bus Master from being cleared, and then implements
>>> this mechanism for the Intel Lynx Point-LP AHCI driver.
>>>
>>> Signed-off-by: Chang Liu <cl91tp@xxxxxxxxx>
>>
>> I'm not 100% comfortable with disabling bus mastering in general;
>> see [1] for more details.
>>
>> But given that we have been doing it for quite a while (b566a22c2 appeared
>> in v3.5 on Jul 21, 2012), and Lynx Point should be in lots of machines, I'm
>> surprised that we don't see more reports of this problem if it really
>> affects all Lynx Point parts.
>>
>> Jility reported the same problem [2], and I did find one other similar
>> report [3] from Florian. All three reports (Chang, Jility, Florian) are on
>> the same exact machine (Acer V5-573G), which seems sort of suspicious.
>>
>> Lan, do you have access to any Lynx Point boxes? Can you test and see
>> whether they hang on power-off also? I suspect this might be something
>> specific to the Acer box, not a generic Lynx Point issue.
>>
>> Bjorn
>>
>> [1] http://lkml.kernel.org/r/20120427190033.GA17588@xxxxxxxxxxxxxx
>> [2] https://bugzilla.kernel.org/show_bug.cgi?id=63861#c21
>> [3] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1249329
>>
>>> ---
>>> As per Takao's suggestion, add a new member into struct pci_dev
>>> and add a quirk in the ahci driver. I tested this on my
>>> machine (Acer V5-573G) and it works fine.
>>>
>>> drivers/ata/ahci.c | 8 ++++++++
>>> drivers/pci/pci-driver.c | 11 ++++++++---
>>> include/linux/pci.h | 1 +
>>> 3 files changed, 17 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c
>>> index 8e28f92..de6efcb 100644
>>> --- a/drivers/ata/ahci.c
>>> +++ b/drivers/ata/ahci.c
>>> @@ -1385,6 +1385,14 @@ static int ahci_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
>>>
>>> pci_set_master(pdev);
>>>
>>> + /* We normally clear Bus Master on pci device shutdown. However,
>>> + * doing so for Intel Lynx Point-LP SATA Controller [AHCI mode]
>>> + * hangs the system. Therefore keep it.
>>> + * See bug report: https://bugzilla.kernel.org/show_bug.cgi?id=63861
>>> + */
>>> + if (pdev->vendor == PCI_VENDOR_ID_INTEL && pdev->device == 0x9c03)
>>> + pdev->keep_busmaster_on_shutdown = 1;
>>> +
>>> if (hpriv->flags & AHCI_HFLAG_MULTI_MSI)
>>> return ahci_host_activate(host, pdev->irq, n_msis);
>>>
>>> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
>>> index 38f3c01..ff15b0c 100644
>>> --- a/drivers/pci/pci-driver.c
>>> +++ b/drivers/pci/pci-driver.c
>>> @@ -392,10 +392,15 @@ static void pci_device_shutdown(struct device *dev)
>>> pci_msix_shutdown(pci_dev);
>>>
>>> /*
>>> - * Turn off Bus Master bit on the device to tell it to not
>>> - * continue to do DMA. Don't touch devices in D3cold or unknown states.
>>> + * If the hardware is okay with it, turn off Bus Master bit
>>> + * on the device to tell it not to continue doing DMA.
>>> + * Don't touch devices in D3cold or unknown states.
>>> + * On certain hardware clearing Bus Master bit on shutdown
>>> + * may hang the entire system. In these cases the driver of
>>> + * these devices should set keep_busmaster_on_shutdown to 1.
>>> */
>>> - if (pci_dev->current_state <= PCI_D3hot)
>>> + if (!pci_dev->keep_busmaster_on_shutdown
>>> + && pci_dev->current_state <= PCI_D3hot)
>>> pci_clear_master(pci_dev);
>>> }
>>>
>>> diff --git a/include/linux/pci.h b/include/linux/pci.h
>>> index da172f9..63db735 100644
>>> --- a/include/linux/pci.h
>>> +++ b/include/linux/pci.h
>>> @@ -322,6 +322,7 @@ struct pci_dev {
>>> /* keep track of device state */
>>> unsigned int is_added:1;
>>> unsigned int is_busmaster:1; /* device is busmaster */
>>> + unsigned int keep_busmaster_on_shutdown:1; /* do not clear busmaster on shutdown */
>>> unsigned int no_msi:1; /* device may not use msi */
>>> unsigned int block_cfg_access:1; /* config space access is blocked */
>>> unsigned int broken_parity_status:1; /* Device generates false positive parity */
>>> --
>>> 1.8.4.2
>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
>>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/