Re: [PATCH 1/1] Revert "iommu/vt-d: Fix possible recursive locking in intel_iommu_init()"

From: Baolu Lu
Date: Tue Sep 20 2022 - 08:36:53 EST


On 2022/9/20 20:16, Robin Murphy wrote:
On 2022-09-20 12:58, Thorsten Leemhuis wrote:
On 20.09.22 10:17, Lu Baolu wrote:
This reverts commit 9cd4f1434479f1ac25c440c421fbf52069079914.

Thx for taking care of this.

Some issues were reported against the original commit: some Thunderbolt
devices no longer work because of the following DMA fault.

DMAR: DRHD: handling fault status reg 2
DMAR: [INTR-REMAP] Request device [09:00.0] fault index 0x8080
       [fault reason 0x25]
       Blocked a compatibility format interrupt request

Bring it back for now to avoid functional regression.

Fixes: 9cd4f1434479f ("iommu/vt-d: Fix possible recursive locking in intel_iommu_init()")
Link: https://lore.kernel.org/linux-iommu/485A6EA5-6D58-42EA-B298-8571E97422DE@xxxxxxxxxxxxxxxxx/
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216497

Both those reports were against 5.19.y, so this afaics should have a

Cc: <stable@xxxxxxxxxxxxxxx> # 5.19.x

to ensure it's backported.

Speaking of which: Joerg/Will/Robin, it seems quite a few people are
running into this, it hence would be great to get this quickly mainlined
(maybe by letting Linus pick it up straight from the list once ready?)
so stable can pick it up.

As a heads-up, a straight revert is likely to lead to people reporting lockdep warnings against -next, for the patches queued there which exposed this dodgy locking in the first place.

I plan to fix that lockdep warning with the patch below:

https://github.com/LuBaolu/intel-iommu/commit/dff18af627a2a76651b74cd6531f3e9357a97072

It works on my test machines. I am about to test it with more hardware.


Does it work to just move the dmar_register_bus_notifier() call back to where it was, without undoing the rest of the patch? That seems like the change that's overwhelmingly likely to have broken IRQ remapping, and TBH it wasn't clear to me why the original patch moved it to begin with.

The callbacks registered by dmar_register_bus_notifier() can race with
intel_iommu_init(), so the offending commit had to move the registration
down until after the Intel IOMMU initialization is done.
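
To illustrate the ordering concern described above, here is a minimal user-space sketch. It is not the kernel code: every name in it (fake_iommu_state, fake_bus_notifier(), fake_register_notifier(), fake_bus_event(), fake_iommu_init()) is hypothetical, and it is single-threaded, so it only shows the ordering dependency, not a real concurrent race.

```c
/* User-space illustration only; none of these names exist in the kernel. */
#include <stdbool.h>
#include <stddef.h>

struct fake_iommu_state {
	bool initialized;	/* set once initialization has finished */
	int nr_units;		/* data the callback depends on */
};

static struct fake_iommu_state state;

typedef int (*fake_notifier_fn)(int event);
static fake_notifier_fn registered_cb;	/* NULL until a callback is registered */

/* Callback: needs fully initialized state to do anything useful. */
int fake_bus_notifier(int event)
{
	(void)event;
	if (!state.initialized)
		return -1;	/* ran too early: state is not ready yet */
	return state.nr_units;
}

/* Once registered, the callback may be invoked by any later bus event. */
void fake_register_notifier(fake_notifier_fn fn)
{
	registered_cb = fn;
}

/* Deliver a bus event to the registered callback, if there is one. */
int fake_bus_event(int event)
{
	return registered_cb ? registered_cb(event) : 0;
}

/* Stand-in for the initialization that the callback relies on. */
void fake_iommu_init(void)
{
	state.nr_units = 2;
	state.initialized = true;
}
```

If fake_register_notifier(fake_bus_notifier) is called first and a bus event arrives before fake_iommu_init() has run, the callback sees uninitialized state and returns -1. Registering only after fake_iommu_init() closes that window in this sketch, which is the motivation given above for moving the registration down.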

Best regards,
baolu