Re: [PATCH v2] PCI: Reset PCIe devices to stop ongoing DMA

From: Bjorn Helgaas
Date: Tue Jul 30 2013 - 11:59:45 EST

On Tue, Jul 30, 2013 at 12:09 AM, Takao Indoh
<indou.takao@xxxxxxxxxxxxxx> wrote:
> (2013/07/29 23:17), Bjorn Helgaas wrote:
>> On Sun, Jul 28, 2013 at 6:37 PM, Takao Indoh <indou.takao@xxxxxxxxxxxxxx> wrote:
>>> (2013/07/26 2:00), Bjorn Helgaas wrote:

>>>> My point about IOMMU and PCI initialization order doesn't go away just
>>>> because it doesn't fit "kdump policy." Having system initialization
>>>> occur in a logical order is far more important than making kdump work.
>>> My next plan is as follows. I think this is matched to logical order
>>> on boot.
>>> drivers/pci/pci.c
>>> - Add function to reset bus, for example, pci_reset_bus(struct pci_bus *bus)
>>> drivers/iommu/intel-iommu.c
>>> - On initialization, if IOMMU is already enabled, call this bus reset
>>> function before disabling and re-enabling IOMMU.
>> I raised this issue because of arches like sparc that enumerate the
>> IOMMU before the PCI devices that use it. In that situation, I think
>> you're proposing this:
>>   panic kernel
>>     enable IOMMU
>>     panic
>>   kdump kernel
>>     initialize IOMMU (already enabled)
>>       pci_reset_bus
>>       disable IOMMU
>>       enable IOMMU
>>     enumerate PCI devices
>> But the problem is that when you call pci_reset_bus(), you haven't
>> enumerated the PCI devices, so you don't know what to reset.
> Right, so my idea is adding reset code into "intel-iommu.c". intel-iommu
> initialization is based on the assumption that enumeration of PCI devices
> is already done. We can find target device from IOMMU page table instead
> of scanning all devices in pci tree.
> Therefore, this idea is only for intel-iommu. Other architectures need
> to implement their own reset code.

That's my point. I'm opposed to adding code to PCI when it only
benefits x86 and we know other arches will need a fundamentally
different design. I would rather have a design that can work for all
arches.

If your implementation lives entirely under arch/x86 (or in
intel-iommu.c, I guess), I can't object as much. However, I think
that eventually even x86 should enumerate the IOMMUs via ACPI before
we enumerate PCI devices.

It's pretty clear that's how BIOS designers expect the OS to work.
For example, sec 8.7.3 of the Intel Virtualization Technology for
Directed I/O spec, rev 1.3, shows the expectation that remapping
hardware (IOMMU) is initialized before discovering the I/O hierarchy
below a hot-added host bridge. Obviously you're not talking about a
hot-add scenario, but I think the same sequence should apply at
boot-time as well.
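
To make the ordering problem concrete, here is a small user-space toy
model in C. It is only a sketch: every name in it (toy_bus,
toy_reset_bus, toy_boot_iommu_first, and so on) is hypothetical and
none of it is kernel code or an existing kernel API. It assumes a bus
reset can only touch devices that have already been enumerated, and
compares the two boot orders discussed above.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: these types and functions are hypothetical, not kernel APIs. */
#define TOY_MAX_DEVS 8

struct toy_dev {
	int dma_active;				/* 1 while the device may still be doing DMA */
};

struct toy_bus {
	struct toy_dev hw[TOY_MAX_DEVS];	/* devices physically present */
	size_t hw_count;
	struct toy_dev *known[TOY_MAX_DEVS];	/* devices enumerated so far */
	size_t known_count;
};

/* State seen by the kdump kernel: DMA in flight, nothing enumerated yet. */
void toy_bus_after_panic(struct toy_bus *bus, size_t ndevs)
{
	assert(bus != NULL && ndevs <= TOY_MAX_DEVS);
	bus->hw_count = ndevs;
	bus->known_count = 0;
	for (size_t i = 0; i < ndevs; i++)
		bus->hw[i].dma_active = 1;
}

/* Enumeration makes every physically present device known. */
void toy_enumerate(struct toy_bus *bus)
{
	bus->known_count = 0;
	for (size_t i = 0; i < bus->hw_count; i++)
		bus->known[bus->known_count++] = &bus->hw[i];
}

/* Reset only the devices already enumerated; return how many were reset. */
size_t toy_reset_bus(struct toy_bus *bus)
{
	size_t n = 0;

	for (size_t i = 0; i < bus->known_count; i++) {
		bus->known[i]->dma_active = 0;
		n++;
	}
	return n;
}

/* Count devices that may still be doing DMA. */
size_t toy_count_dma_active(const struct toy_bus *bus)
{
	size_t n = 0;

	for (size_t i = 0; i < bus->hw_count; i++)
		n += bus->hw[i].dma_active ? 1 : 0;
	return n;
}

/* IOMMU init (with its reset) runs before PCI enumeration. */
size_t toy_boot_iommu_first(struct toy_bus *bus)
{
	size_t reset = toy_reset_bus(bus);	/* nothing is known yet */

	toy_enumerate(bus);
	return reset;
}

/* PCI enumeration runs before IOMMU init (with its reset). */
size_t toy_boot_pci_first(struct toy_bus *bus)
{
	toy_enumerate(bus);
	return toy_reset_bus(bus);
}
```

In this model, calling toy_boot_iommu_first() on a bus prepared with
toy_bus_after_panic() resets no devices and leaves all of them with DMA
active, while toy_boot_pci_first() resets every device. That is the
reason a reset hooked into IOMMU initialization cannot be shared with
arches that bring up the IOMMU before enumerating PCI.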
