Re: [External] Re: [PATCH] iommu/vt-d: fix system hang on reboot -f

From: Ethan Zhao
Date: Sun Feb 23 2025 - 22:21:49 EST



On 2025/2/21 17:46, yunhui cui wrote:
Hi Ethan,

On Fri, Feb 21, 2025 at 4:40 PM Ethan Zhao <haifeng.zhao@xxxxxxxxxxxxxxx> wrote:

On 2025/2/20 18:15, Yunhui Cui wrote:
When entering intel_iommu_shutdown, system interrupts are disabled,
System interrupts are disabled? If you mean that all interrupts are disabled
on entry to intel_iommu_shutdown(), that may not be true, at least for the
latest upstream code.

and the reboot process might be scheduled out by down_write(). If the
scheduled process does not yield (e.g., while(1)), the system will hang.
Doesn't the NMI lockup watchdog fire here?
Steps to reproduce:

1. Make sure this early return in intel_iommu_shutdown() is not taken:
if (no_iommu || dmar_disabled)
return;

2. Build an a.out that spins in while(1).

3. Run ./a.out in the background (./a.out &), then run reboot -f.

4. Observe. Send NMI via BIOS to check system response.

Via the BMC? Some machines have a physical 'NMI' button that sends an NMI to
the OS for diagnostic purposes; you could check whether your box has one. No
luck here, though: my GNR BMC has no NMI trigger.


Thanks,
Ethan


5. Add console=ttyS0,115200 to cmdline to increase reproduction chance.

Let's continue discussing based on the above.

Thanks,
Ethan

Signed-off-by: Yunhui Cui <cuiyunhui@xxxxxxxxxxxxx>
---
drivers/iommu/intel/iommu.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index cc46098f875b..76a1d83b46bf 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -2871,7 +2871,8 @@ void intel_iommu_shutdown(void)
 	if (no_iommu || dmar_disabled)
 		return;
 
-	down_write(&dmar_global_lock);
+	if (!down_write_trylock(&dmar_global_lock))
+		return;
 
 	/* Disable PMRs explicitly here. */
 	for_each_iommu(iommu, drhd)
--
"firm, enduring, strong, and long-lived"

Thanks,
Yunhui
