Re: [PATCH v2 2/2] Intel-IOMMU, intr-remap: source-id checking
From: Ingo Molnar
Date: Wed May 20 2009 - 04:22:25 EST
* Eric W. Biederman <ebiederm@xxxxxxxxxxxx> wrote:
> Ingo Molnar <mingo@xxxxxxx> writes:
>
> > * Weidong Han <weidong.han@xxxxxxxxx> wrote:
> >
> >> To support domain-isolation usages, the platform hardware must be
> >> capable of uniquely identifying the requestor (source-id) for each
> >> interrupt message. Without source-id checking for interrupt
> >> remapping , a rouge guest/VM with assigned devices can launch
> >> interrupt attacks to bring down anothe guest/VM or the VMM itself.
> >>
> >> This patch adds source-id checking for interrupt remapping, and
> >> then really isolates interrupts for guests/VMs with assigned
> >> devices.
> >>
> >> Because PCI subsystem is not initialized yet when set up IOAPIC
> >> entries, use read_pci_config_byte to access PCI config space
> >> directly.
> >>
> >> Signed-off-by: Weidong Han <weidong.han@xxxxxxxxx>
> >> ---
> >> arch/x86/kernel/apic/io_apic.c | 6 +++
> >> drivers/pci/intr_remapping.c | 90 ++++++++++++++++++++++++++++++++++++++-
> >> drivers/pci/intr_remapping.h | 2 +
> >> include/linux/dmar.h | 11 +++++
> >> 4 files changed, 106 insertions(+), 3 deletions(-)
> >
> > Code structure looks nice now. (and i susect you have tested this on
> > real and relevant hardware?) I've Cc:-ed Eric too ... does this
> > direction look good to you too Eric?
>
> Being a major nitpick, I have to point out that the code is not
> structured to support other iommus, and I think AMD has one that
> can do this as well.
(Joerg Cc:-ed)
> The early pci reading of the bus is just wrong. What happens if
> the pci layer decided to renumber things? It looks like we have a
> real dependency on pci there and are avoiding sorting it out with
> this.
Yes ... but is there much we can do about this bootstrap dependency?
We want to enable the IO-APIC very early in its final form. There's
quite a bit of IRQ functionality that doesnt go via the PCI layer,
and which is being relied on by early bootup. The timer irq must
work, etc.
> Hmm. But that is what we use in setup_ioapic_sid.... I expect the
> right solution is to delay enabling ioapic entries until driver
> enable them. That could also reduce screaming irqs during bootup
> in the kdump case.
Yes, and note that we are moving in that direction in tip:irq/numa
(Yinghai Cc:-ed) - we are delaying IRQ setup to the very last
moment. We recently got rid of 32-bit IRQ compression in that branch
as well and the IRQ vectoring code is now unified between 64-bit and
32-bit so it's nify and you might want to check it and look for
holes ...
( The motivation there was different: delay allocation of the
irq_desc so that we can make it NUMA-local - but it has the same
general effect. )
> set_msi_sid looks wrong. The comment are unhelpful. irte->svt
> should get an enum value or a deine (removing the repeated
> explanations of the magic value) and then we could have room to
> explain why we are doing what we are doing.
(yes, it probably wants a helper method - i pointed these smaller
details out in my review.)
> Not finding an upstream pcie_bridge and then concluding we are a
> pcie device seems bogus.
>
> Why if we do have an upstream pcie bridge do we only want to do a
> bus range verification instead of checking just for the bus
> :devfn?
>
> The legacy PCI case seems even stranger.
Good points. Would be nice to get an ack from the PCI folks to make
sure these bits are sane.
> ....
>
> The table of apic information by apic_id also seems wrong. Don't
> we have chip_data or something that should point it that we can
> get from the irq?
->chip_data is already used for io-apic configuration bits - if it's
reused then the right way to do it is to extend struct irq_cfg in
arch/x86/kernel/apic/io_apic.c.
Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/