RE: [PATCH] PCI: Blacklist AMD Stoney GPU devices for ATS
From: Deucher, Alexander
Date: Tue Mar 28 2017 - 16:18:42 EST
> -----Original Message-----
> From: Joerg Roedel [mailto:joro@xxxxxxxxxx]
> Sent: Tuesday, March 28, 2017 8:17 AM
> To: Bjorn Helgaas
> Cc: linux-pci@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; Joerg Roedel;
> Daniel Drake; Deucher, Alexander
> Subject: [PATCH] PCI: Blacklist AMD Stoney GPU devices for ATS
>
> From: Joerg Roedel <jroedel@xxxxxxx>
>
> ATS is broken on these devices. Under invalidation load, the
> GPU does not reply to invalidations anymore, causing
> Completion-wait loop timeouts on the AMD IOMMU driver side.
> Fix it by not enabling ATS on these devices.
>
> Note that below mentioned commit is not broken, it just
> triggers the issue because it might cause invalidation
> storms on devices.
>
> Fixes: b1516a14657a ('iommu/amd: Implement flush queue')
> Reported-by: Daniel Drake <drake@xxxxxxxxxxxx>
> Cc: Daniel Drake <drake@xxxxxxxxxxxx>
> Cc: Alexander Deucher <Alexander.Deucher@xxxxxxx>
> Signed-off-by: Joerg Roedel <jroedel@xxxxxxx>
Did you see Arindam's patch from yesterday[1]? I'm not sure which is the proper fix; maybe we need both?
Alex
[1] - https://lists.freedesktop.org/archives/amd-gfx/2017-March/006862.html
> ---
> drivers/pci/ats.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c
> index eeb9fb2..711bdb2 100644
> --- a/drivers/pci/ats.c
> +++ b/drivers/pci/ats.c
> @@ -17,10 +17,18 @@
>
> #include "pci.h"
>
> +static const struct pci_device_id broken_ats_tbl[] = {
> + { PCI_DEVICE(PCI_VENDOR_ID_AMD, 0x98e4) }, /* AMD Stoney GPU part */
> + { 0 }
> +};
> +
> void pci_ats_init(struct pci_dev *dev)
> {
> int pos;
>
> + if (pci_match_id(broken_ats_tbl, dev))
> + return;
> +
> pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_ATS);
> if (!pos)
> return;
> --
> 1.9.1
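
For readers unfamiliar with the idiom used in the quoted patch, the sketch below models it in plain userspace C: a sentinel-terminated ID table is scanned, and a match causes the ATS setup path to be skipped. This is only an illustration under stated assumptions, not the kernel code. The names sketch_dev_id, sketch_broken_ats_tbl, sketch_match_id and sketch_ats_allowed are hypothetical stand-ins for struct pci_device_id, broken_ats_tbl, pci_match_id() and the early return in pci_ats_init(). The vendor value 0x1022 is AMD's PCI vendor ID, and the device ID 0x98e4 is taken from the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the kernel's struct pci_device_id. */
struct sketch_dev_id {
	uint16_t vendor;
	uint16_t device;
};

/* Assumption for illustration: 0x1022 is AMD's PCI vendor ID. */
#define SKETCH_VENDOR_AMD 0x1022

/* Mirrors the shape of broken_ats_tbl in the quoted patch. */
static const struct sketch_dev_id sketch_broken_ats_tbl[] = {
	{ SKETCH_VENDOR_AMD, 0x98e4 },	/* device ID taken from the patch */
	{ 0, 0 }			/* all-zero sentinel ends the table */
};

/* Walk the table until the sentinel; true if (vendor, device) is listed. */
static bool sketch_match_id(const struct sketch_dev_id *tbl,
			    uint16_t vendor, uint16_t device)
{
	for (; tbl->vendor || tbl->device; tbl++) {
		if (tbl->vendor == vendor && tbl->device == device)
			return true;
	}
	return false;
}

/*
 * Models the early return added to pci_ats_init(): a listed device
 * never reaches the ATS capability lookup, so ATS stays disabled.
 */
static bool sketch_ats_allowed(uint16_t vendor, uint16_t device)
{
	return !sketch_match_id(sketch_broken_ats_tbl, vendor, device);
}
```

With this model, sketch_ats_allowed(0x1022, 0x98e4) is false while any unlisted vendor/device pair yields true. Covering another affected device would mean adding one more table entry above the sentinel.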