Re: [PATCH 1/3] iommu/vt-d: skip RMRR entries that fail the sanity check
From: Barret Rhoden
Date: Mon Dec 16 2019 - 14:35:31 EST
On 12/16/19 2:07 PM, Chen, Yian wrote:
On 12/11/2019 11:46 AM, Barret Rhoden wrote:
RMRR entries describe memory regions that are DMA targets for devices
outside the kernel's control.
RMRR entries that fail the sanity check are pointing to regions of
memory that the firmware did not tell the kernel are reserved or
otherwise should not be used.
Instead of aborting DMAR processing, this commit skips these RMRR
entries. They will not be mapped into the IOMMU, but the IOMMU can
still be utilized. If anything, when the IOMMU is on, those devices
will not be able to clobber RAM that the kernel has allocated from those
regions.
Signed-off-by: Barret Rhoden <brho@xxxxxxxxxx>
---
 drivers/iommu/intel-iommu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index f168cd8ee570..f7e09244c9e4 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -4316,7 +4316,7 @@ int __init dmar_parse_one_rmrr(struct acpi_dmar_header *header, void *arg)
      rmrr = (struct acpi_dmar_reserved_memory *)header;
      ret = arch_rmrr_sanity_check(rmrr);
      if (ret)
-        return ret;
+        return 0;
      rmrru = kzalloc(sizeof(*rmrru), GFP_KERNEL);
      if (!rmrru)
The RMRR parsing function should report the error to its caller. How to
respond to the error can be chosen by a caller further up the call stack,
for example dmar_walk_remapping_entries().
My concern is that ignoring a detected firmware bug might have side
effects, even though it seems safe in your case.
That's a little difficult given the current code. Once we are in
dmar_walk_remapping_entries(), the specific function (parse_one_rmrr) is
called via callback:
	ret = cb->cb[iter->type](iter, cb->arg[iter->type]);
	if (ret)
		return ret;
If there's an error of any sort, it aborts the walk. Handling the
specific errors here is difficult, since we don't know what the errors
mean to the specific callback. Is there some errno we can use that
means "there was a problem, but it's not so bad that you have to abort,
but I figured you ought to know"? Not that I think that's a good idea.
The knowledge of whether or not a specific error is worth aborting all
DMAR functionality is best known inside the specific callback. The only
handling to do is print a warning and either skip it or abort.
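
To make the skip-versus-abort choice concrete, here is a minimal userspace
sketch of a callback-driven table walk. It is not the kernel code: the
types (struct entry, struct walk_cb), the "sane" flag standing in for
arch_rmrr_sanity_check(), and the helpers walk_entries(),
parse_rmrr_abort(), parse_rmrr_skip() and run_walk() are all hypothetical
names invented for illustration. Only the "nonzero return aborts the walk"
shape is taken from the snippet above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for DMAR table entries; not the kernel's types. */
enum entry_type { ENTRY_DRHD, ENTRY_RMRR, ENTRY_NR_TYPES };

struct entry {
	enum entry_type type;
	int sane;	/* stand-in for the outcome of the sanity check */
};

typedef int (*entry_cb)(const struct entry *e, void *arg);

struct walk_cb {
	entry_cb cb[ENTRY_NR_TYPES];
	void *arg[ENTRY_NR_TYPES];
};

/* Generic walker: it cannot interpret errors, so any nonzero return aborts. */
static int walk_entries(const struct entry *tbl, size_t n,
			const struct walk_cb *cb)
{
	size_t i;
	int ret;

	assert(cb != NULL);
	for (i = 0; i < n; i++) {
		const struct entry *iter = &tbl[i];

		if (!cb->cb[iter->type])
			continue;	/* simplification: ignore unhandled types */
		ret = cb->cb[iter->type](iter, cb->arg[iter->type]);
		if (ret)
			return ret;
	}
	return 0;
}

/* Pre-patch shape: the failure propagates and ends the whole walk. */
static int parse_rmrr_abort(const struct entry *e, void *arg)
{
	int *mapped = arg;

	if (!e->sane)
		return -EINVAL;
	(*mapped)++;
	return 0;
}

/* Patched shape: the callback decides the error is not fatal, warns, skips. */
static int parse_rmrr_skip(const struct entry *e, void *arg)
{
	int *mapped = arg;

	if (!e->sane) {
		fprintf(stderr, "skipping RMRR entry that failed the sanity check\n");
		return 0;
	}
	(*mapped)++;
	return 0;
}

/* Walk a fixed table (one bad RMRR between two good ones) with rmrr_cb. */
static int run_walk(entry_cb rmrr_cb, int *mapped)
{
	static const struct entry table[] = {
		{ ENTRY_DRHD, 1 },
		{ ENTRY_RMRR, 1 },
		{ ENTRY_RMRR, 0 },	/* fails the sanity check */
		{ ENTRY_RMRR, 1 },
	};
	struct walk_cb cb = { { NULL }, { NULL } };

	cb.cb[ENTRY_RMRR] = rmrr_cb;
	cb.arg[ENTRY_RMRR] = mapped;
	return walk_entries(table, sizeof(table) / sizeof(table[0]), &cb);
}
```

With parse_rmrr_abort() the walk stops at the bad entry and the good RMRR
after it is never processed; with parse_rmrr_skip() the walk completes and
only the bad entry is left out. The walker itself stays unchanged in both
cases, which is the point: the policy lives in the callback.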
I think skipping the entry for a bad RMRR is better than aborting
completely, though I understand if people don't like that. It's
debatable. By aborting, we lose the ability to use the IOMMU at all,
but we are still in a situation where the devices using the RMRR regions
might be clobbering kernel memory, right? Using the IOMMU (with no
mappings for the bad RMRRs) would stop those devices from clobbering memory.
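
The argument can be illustrated with a deliberately tiny toy model,
sketched below under loud assumptions: struct toy_iommu, toy_dma_write()
and the single mapped window are hypothetical and bear no resemblance to
the real VT-d page-table structures. The sketch only shows the claim that
a device write to an address with no mapping is rejected when translation
is on, and goes straight to memory when it is off.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_RAM_SIZE 16

/* Hypothetical toy model: one mapped window instead of real page tables. */
struct toy_iommu {
	bool enabled;
	size_t map_start;	/* inclusive */
	size_t map_end;		/* exclusive; start == end means nothing mapped */
};

/* Model a device DMA write of one byte to "physical" address addr. */
static int toy_dma_write(const struct toy_iommu *mmu, uint8_t *ram,
			 size_t addr, uint8_t val)
{
	assert(mmu != NULL && ram != NULL);
	if (addr >= TOY_RAM_SIZE)
		return -EINVAL;
	/* With translation on, an address outside the window is blocked. */
	if (mmu->enabled && (addr < mmu->map_start || addr >= mmu->map_end))
		return -EFAULT;
	ram[addr] = val;	/* IOMMU off, or address is mapped */
	return 0;
}
```

In this model, aborting DMAR processing corresponds to enabled == false
(the write lands in RAM), while skipping the bad RMRR corresponds to
enabled == true with no window covering the address (the write is refused
and RAM is untouched).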
Regardless, I have two other patches in this series that could resolve
the problem for me and probably other people. I'd just like at least
one of the three patches to get merged so that my machine boots when the
original commit f036c7fa0ab6 ("iommu/vt-d: Check VT-d RMRR region in
BIOS is reported as reserved") gets released.
Thanks,
Barret