For health monitoring, it is useful to know whether the IOMMU is behaving
as expected. DMAR faults can be an indicator that a device:
- has been misconfigured, or
- has experienced a hardware problem and should be considered for
  replacement, or
- is generating faults as a result of malicious activity.

Currently the only way to check whether any DMAR faults have occurred on
the host is to scan the dmesg output. This approach is fragile: the
messages we are looking for can be lost when the kernel log buffer wraps,
or suppressed (the print is rate-limited) by faults from another device.
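
For illustration, the dmesg-based check described above might look roughly
like the sketch below. The helper name, the regular expression, and the
sample log lines are assumptions made for this example; the exact wording of
DMAR fault messages differs between kernel versions. The sketch can only
count lines that are still present in the text it is given, which is exactly
the limitation described above.

```python
import re
from collections import Counter

# Assumed shape of a DMAR fault line; real kernels word this differently
# across versions, so treat the pattern as illustrative only.
FAULT_RE = re.compile(
    r"DMAR:.*Request device \[([0-9a-fA-F:.]+)\].*fault", re.IGNORECASE
)


def count_dmar_faults(log_text):
    """Count DMAR fault lines per device in a chunk of kernel log text."""
    counts = Counter()
    for line in log_text.splitlines():
        match = FAULT_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical log excerpt, not taken from a real machine.
SAMPLE_LOG = """\
[  100.000001] DMAR: DRHD: handling fault status reg 2
[  100.000002] DMAR: [DMA Write] Request device [03:00.0] fault addr ffee0000 [fault reason 05] PTE Write access is not set
[  100.500000] usb 1-1: new high-speed USB device number 2 using xhci_hcd
[  101.000000] DMAR: [DMA Read] Request device [03:00.0] fault addr ffee1000 [fault reason 06] PTE Read access is not set
[  102.000000] DMAR: [DMA Read] Request device [00:1f.2] fault addr 0 [fault reason 02] Present bit in context entry is clear
"""

if __name__ == "__main__":
    for device, count in sorted(count_dmar_faults(SAMPLE_LOG).items()):
        print(device, count)
```

Any fault whose message was rate-limited away or has already wrapped out of
the buffer is invisible to such a scan, so the result is at best a lower
bound.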

The series adds counters for DMAR faults and exposes them via sysfs.

Yuri Volchkov (2):
  iommu/dmar: collect fault statistics
  iommu/dmar: catch early fault occurrences

 drivers/iommu/dmar.c        | 182 ++++++++++++++++++++++++++++++++----
 drivers/iommu/intel-iommu.c |   1 +
 drivers/pci/pci-sysfs.c     |  20 ++++
 include/linux/intel-iommu.h |   4 +
 include/linux/pci.h         |  11 +++
 5 files changed, 201 insertions(+), 17 deletions(-)