I can imagine a bunch of ways around this.
One option is to hook in a check for buggy RMRRs in intel-iommu.c. If
the base and end are 0, just ignore the entry. That works for my
specific buggy DMAR entry. There might be other buggy entries out
there. The docs specify some requirements on the base and end (called
limit) addresses.
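
To make the first option concrete, here is a rough standalone sketch of such a check. It is not the code from intel-iommu.c: the struct, the function name and the page-size constant are made up for illustration, and the alignment and ordering rules are my reading of what the docs require of the base and limit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: RMRR regions are described in 4 KiB units. */
#define RMRR_SKETCH_PAGE_SIZE 4096ULL

/* Hypothetical stand-in for the base/limit pair of one RMRR entry. */
struct rmrr_range {
	uint64_t base;	/* first byte of the region */
	uint64_t end;	/* last byte of the region (the "limit") */
};

/* Return true if the entry looks like firmware garbage and should be ignored. */
static bool rmrr_range_is_bogus(const struct rmrr_range *r)
{
	/* The specific bug I hit: base and end are both 0. */
	if (r->base == 0 && r->end == 0)
		return true;

	/* The limit must lie above the base. */
	if (r->end <= r->base)
		return true;

	/* The base must be page aligned. */
	if (r->base % RMRR_SKETCH_PAGE_SIZE)
		return true;

	/* The limit is inclusive, so limit + 1 must be page aligned. */
	if ((r->end + 1) % RMRR_SKETCH_PAGE_SIZE)
		return true;

	return false;
}
```

With this, a zeroed entry and an entry whose limit is below its base are both skipped, while something like 0x1000-0x1fff is accepted.
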
Another option is to change the sanity check so that unmapped ranges are
considered OK. That would work for my case, but only because we're
hiding the firmware bug: my DMAR has a bad RMRR that happens to fall into a
reserved or non-existent range. The downside here is that we'd
presumably be setting up an IOMMU mapping for this bad RMRR. But at
least it's not pointing to any RAM we're using. (That's actually what
goes on in the current, non-kexec case for me. Phys page 0 is marked
RESERVED, and I have an RMRR that points to it.) This option would also
cover any buggy firmware whose RMRR is real but points to memory that
was omitted from the e820 map.
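
As a sketch of what the relaxed check in the second option could look like: the snippet below uses a toy memory map rather than the real e820 code, and every name in it is hypothetical. The idea is only that a range is rejected when it overlaps something that is mapped and not RESERVED; holes in the map are let through.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy memory-map entry types; assumption: only these two matter here. */
enum mem_type { MEM_RAM, MEM_RESERVED };

/* Hypothetical memory-map entry with an inclusive end address. */
struct mem_entry {
	uint64_t start;
	uint64_t end;
	enum mem_type type;
};

/*
 * Relaxed sanity check: accept [base, end] as long as it does not overlap
 * any entry that is mapped as something other than RESERVED. Parts of the
 * range that fall into holes in the map are considered OK.
 */
static bool rmrr_range_ok(const struct mem_entry *map, size_t n,
			  uint64_t base, uint64_t end)
{
	for (size_t i = 0; i < n; i++) {
		bool overlaps = map[i].start <= end && base <= map[i].end;

		if (overlaps && map[i].type != MEM_RESERVED)
			return false;
	}
	return true;
}
```

An RMRR covering page 0 passes if page 0 is either RESERVED or missing from the map, but it is still rejected if page 0 shows up as usable RAM.
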
A third option: whenever the RMRR sanity check fails, just ignore it and
return 0. Don't set up the rmrru. Right now, we completely abort DMAR
processing. If there was an actual device that needed to poke this
memory that failed the sanity check (meaning, not RESERVED, currently),
then we're already in trouble; that device could clobber RAM, right? If
we're going to use the IOMMU, I'd rather the device be behind an IOMMU
with *no* mapping for the region, so it couldn't clobber whatever we
happened to put in that location.
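
The control flow for the third option is roughly the following. This is a self-contained sketch, not the actual DMAR parsing code: the struct, the "sane" flag standing in for the real sanity check, and both function names are invented for illustration.

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical RMRR entry; "sane" stands in for the sanity check result. */
struct rmrr_candidate {
	uint64_t base;
	uint64_t end;
	bool sane;
};

/* Handle one entry. A bad entry is skipped and reported as success. */
static int parse_one_rmrr(const struct rmrr_candidate *c, size_t *nr_kept)
{
	if (!c->sane) {
		fprintf(stderr, "ignoring bad RMRR [0x%" PRIx64 "-0x%" PRIx64 "]\n",
			c->base, c->end);
		return 0;	/* no rmrru is set up, but parsing goes on */
	}

	(*nr_kept)++;		/* stands in for setting up the rmrru */
	return 0;
}

/* Walk the whole table; only a nonzero return would abort processing. */
static int parse_all_rmrrs(const struct rmrr_candidate *tbl, size_t n,
			   size_t *nr_kept)
{
	*nr_kept = 0;
	for (size_t i = 0; i < n; i++) {
		int ret = parse_one_rmrr(&tbl[i], nr_kept);

		if (ret)
			return ret;
	}
	return 0;
}
```

The point is that one bad entry no longer takes the good entries down with it: the bad one gets no mapping, and the rest are still processed.
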
I actually think all three options are reasonable ideas independently of
one another. This patchset does all three. Please take at least
one of them. =) (It may require a slight revision if you don't take all
of them.)