Re: [PATCH] PCI: Blacklist AMD Stoney GPU devices for ATS

From: Bjorn Helgaas
Date: Tue Apr 04 2017 - 12:43:23 EST


Hi Joerg,

On Tue, Mar 28, 2017 at 02:16:44PM +0200, Joerg Roedel wrote:
> From: Joerg Roedel <jroedel@xxxxxxx>
>
> ATS is broken on these devices. Under invalidation load, the
> GPU does not reply to invalidations anymore, causing
> Completion-wait loop timeouts on the AMD IOMMU driver side.
> Fix it by not enabling ATS on these devices.
>
> Note that below mentioned commit is not broken, it just
> triggers the issue because it might cause invalidation
> storms on devices.
>
> Fixes: b1516a14657a ('iommu/amd: Implement flush queue')
> Reported-by: Daniel Drake <drake@xxxxxxxxxxxx>
> Cc: Daniel Drake <drake@xxxxxxxxxxxx>
> Cc: Alexander Deucher <Alexander.Deucher@xxxxxxx>
> Signed-off-by: Joerg Roedel <jroedel@xxxxxxx>
> ---
> drivers/pci/ats.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c
> index eeb9fb2..711bdb2 100644
> --- a/drivers/pci/ats.c
> +++ b/drivers/pci/ats.c
> @@ -17,10 +17,18 @@
>
> #include "pci.h"
>
> +static const struct pci_device_id broken_ats_tbl[] = {
> +	{ PCI_DEVICE(PCI_VENDOR_ID_AMD, 0x98e4) }, /* AMD Stoney GPU part */
> +	{ 0 }
> +};
> +
>  void pci_ats_init(struct pci_dev *dev)
>  {
>  	int pos;
>
> +	if (pci_match_id(broken_ats_tbl, dev))
> +		return;

This is fine functionally, but stylistically I would prefer to have it
implemented in drivers/pci/quirks.c, so that we stay consistent in how
we work around device defects.
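
To illustrate the idea, here is a minimal userspace sketch of the
quirk-table pattern, not actual kernel code. All names in it
(mock_pci_dev, mock_fixup, mock_run_fixups, mock_ats_usable) are
hypothetical stand-ins, and zeroing an ats_cap field is an assumed way
of hiding the capability. In the kernel this would presumably be a
fixup registered in drivers/pci/quirks.c with one of the
DECLARE_PCI_FIXUP_* macros.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for struct pci_dev; not the kernel type. */
struct mock_pci_dev {
	uint16_t vendor;
	uint16_t device;
	int ats_cap;	/* assumed: offset of the ATS capability, 0 if none */
};

/* Hypothetical fixup entry: run hook on devices matching vendor/device. */
struct mock_fixup {
	uint16_t vendor;
	uint16_t device;
	void (*hook)(struct mock_pci_dev *dev);
};

/* The quirk itself: make the device look as if it had no ATS capability. */
static void quirk_no_ats(struct mock_pci_dev *dev)
{
	dev->ats_cap = 0;
}

/* All device workarounds sit in one table, next to each other. */
static const struct mock_fixup mock_fixups[] = {
	/* IDs as used in the patch; 0x1022 is PCI_VENDOR_ID_AMD */
	{ 0x1022, 0x98e4, quirk_no_ats },
};

/* Run every matching fixup before feature code looks at the device. */
static void mock_run_fixups(struct mock_pci_dev *dev)
{
	size_t i;

	assert(dev != NULL);
	for (i = 0; i < sizeof(mock_fixups) / sizeof(mock_fixups[0]); i++) {
		if (mock_fixups[i].vendor == dev->vendor &&
		    mock_fixups[i].device == dev->device)
			mock_fixups[i].hook(dev);
	}
}

/* Feature code keeps no device list of its own; it only checks the field. */
static int mock_ats_usable(const struct mock_pci_dev *dev)
{
	return dev->ats_cap != 0;
}
```

The point of the split is that the ATS code stays free of device IDs,
and the workaround ends up where people already look for device
defects.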

>  	pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_ATS);
>  	if (!pos)
>  		return;
> --
> 1.9.1
>