Re: [PATCH v2 2/5] x86/PCI: Support additional MMIO range capabilities

From: Bjorn Helgaas
Date: Mon Apr 28 2014 - 16:50:58 EST


[+cc Jan (24d9b70b8 author), Yinghai]

On Sat, Apr 26, 2014 at 3:10 AM, Borislav Petkov <bp@xxxxxxx> wrote:
> + Robert.
>
> On Fri, Apr 25, 2014 at 04:24:31PM -0600, Myron Stowe wrote:
>> On Sun, Apr 20, 2014 at 1:59 AM, Borislav Petkov <bp@xxxxxxx> wrote:
>> > Drop Andreas' old email address from CC as it keeps bouncing.
>> >
>> > On Sat, Apr 19, 2014 at 03:52:20PM +0200, Borislav Petkov wrote:
>> >> > -static void __init pci_enable_pci_io_ecs(void)
>> >> > +static void __init pci_enable_pci_io_ecs(u8 bus, u8 slot)
>> >> > {
>> >> > #ifdef CONFIG_AMD_NB
>> >> > unsigned int i, n;
>> >> > + u8 limit;
>> >> >
>> >> > for (n = i = 0; !n && amd_nb_bus_dev_ranges[i].dev_limit; ++i) {
>> >> > - u8 bus = amd_nb_bus_dev_ranges[i].bus;
>> >> > - u8 slot = amd_nb_bus_dev_ranges[i].dev_base;
>> >> > - u8 limit = amd_nb_bus_dev_ranges[i].dev_limit;
>> >> > + /* Try matching for the bus range */
>> >> > + if ((bus != amd_nb_bus_dev_ranges[i].bus) ||
>> >> > + (slot != amd_nb_bus_dev_ranges[i].dev_base))
>> >> > + continue;
>> >> > +
>> >> > + limit = amd_nb_bus_dev_ranges[i].dev_limit;
>> >> >
>> >> > + /* Setup all northbridges within the range */
>> >> > for (; slot < limit; ++slot) {
>> >> > u32 val = read_pci_config(bus, slot, 3, 0);
>> >> > -
>> >> > - if (!early_is_amd_nb(val))
>> >> > + if (!val)
>> >> > continue;
>> >> >
>> >> > val = read_pci_config(bus, slot, 3, 0x8c);
>> >> > @@ -375,13 +457,14 @@ static void __init pci_enable_pci_io_ecs(void)
>> >> > val |= ENABLE_CF8_EXT_CFG >> 32;
>> >>
>> >> What a fun shifting!
>> >>
>> >> Maybe you should do
>> >>
>> >> #define ENABLE_CF8_EXT_CFG BIT(46 - 32)
>> >>
>> >> to show exactly what you mean and how the bit is defined in MSR NB_CFG1
>> >> and also show how the high 32-bits are mapped into F3x8c, while at it.
>> >>
>> >> And then you can drop the shifting at the call site.
>> >
>> > Ok, I see another fun with this ECS enabling:
>> >
>> > There's an enable_pci_io_ecs() which enables ECS through the NB_CFG MSR
>> > which is called as part of the notifier *and* there's a PCI write to
>> > that same bit in pci_enable_pci_io_ecs() which iterates over all NBs.
>> >
>> > So, AFAICT, we do it twice and the second time is not needed. Which
>> > means, you probably can drop pci_enable_pci_io_ecs() completely and use
>> > solely the notifier?
>>
>> It does look as if there is some duplication with respect to setting
>> MSR_AMD64_NB_CFG's (which is aliased at D18F3x8c [1])
>> ENABLE_CF8_EXT_CFG enable bit but there are at least a couple of
>> differences.
>>
>> enable_pci_io_ecs() only sets the bit on one NB whereas
>> pci_enable_pci_io_ecs iterates over all the NBs (as you mentioned
>> above). The other difference has something to do with Xen; see the
>> origin of pci_enable_pci_io_ecs - commit 24d9b70b8.
>
> Of course it is xen - what else?! We do have to carry special code in
> baremetal just for it because it is special and we all can't seem to get
> enough of its crap.
>
> Oh well, I guess we should at least comment this and refer to 24d9b70b8
> so that the explanation is right there, in the code.

This is probably obvious, but my interest here is to (1) make sure all
systems in the field run well (so we need quirks to work around BIOS
and other issues), and (2) eliminate the need for kernel changes to
support future systems. So far we seem to be concentrating on (1) and
neglecting (2), which means we're always reacting to things that are
broken.

This I/O ECS thing seems likely to cause future problems. My
understanding (based on sec 2.8 of [1]) is that enable_pci_io_ecs()
and pci_enable_pci_io_ecs() are there to enable access to extended
config space (offsets 256-4095) via the 0xcf8/0xcfc I/O ports.
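
For reference, here is a minimal sketch of what "extended config space
via 0xcf8" means at the address level. It is modeled on my reading of
the PCI_CONF1_ADDRESS() macro in arch/x86/pci/direct.c; the function
name conf1_address() is made up for illustration, and this only builds
the value that would be written to port 0xcf8, it does no port I/O.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Build the 32-bit value written to I/O port 0xcf8 for a type 1
 * config access.  Bits 27:24 carry register bits 11:8; that is the
 * I/O ECS extension.  Without ECS enabled those bits are reserved,
 * so only offsets 0-255 are reachable this way.
 */
static uint32_t conf1_address(uint8_t bus, uint8_t devfn, uint16_t reg)
{
	assert(reg < 4096);	/* config space is 4 KB per function */

	return 0x80000000u				/* enable bit */
	     | ((uint32_t)(reg & 0xF00) << 16)		/* reg[11:8] -> bits 27:24 */
	     | ((uint32_t)bus << 16)			/* bus -> bits 23:16 */
	     | ((uint32_t)devfn << 8)			/* dev/fn -> bits 15:8 */
	     | (uint32_t)(reg & 0xFC);			/* dword-aligned reg[7:2] */
}
```

With this, D18F3x8c on bus 0 (devfn 0xc3) encodes as 0x8000c38c, and
offset 0x100 on bus 1, devfn 0 encodes as 0x81010000, where the extra
0x01000000 is the ECS part.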

Per sec 4.1.1 of [2], we should be using ECAM (the memory-mapped
enhanced configuration mechanism, i.e., MMCONFIG) to access extended
config space, and the BIOS should supply an MCFG table.
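
For comparison, a sketch of the ECAM side, assuming the usual layout
of 1 MB per bus and 4 KB per function. The helper name ecam_offset()
is hypothetical; the result is the byte offset from the MCFG base
address for bus 0 of the segment, not a pointer you can dereference.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Byte offset of a config register inside an ECAM window:
 * bus in bits 27:20, device/function in bits 19:12, register in
 * bits 11:0.  All 4096 bytes of config space are directly
 * addressable, so no separate "extended" mechanism is needed.
 */
static uint32_t ecam_offset(uint8_t bus, uint8_t devfn, uint16_t reg)
{
	assert(reg < 4096);	/* config space is 4 KB per function */

	return ((uint32_t)bus << 20)
	     | ((uint32_t)devfn << 12)
	     | (uint32_t)reg;
}
```

So the same D18F3x8c register sits at offset 0xc308c from the base,
and offsets 256-4095 need nothing beyond a valid MCFG entry.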

So why do we need to enable I/O access to ECS on AMD chips at all? Is
this a workaround for a broken BIOS that doesn't supply an MCFG table?

From reading the path below, I think raw_pci_read() will use
pci_direct_conf1 for domain 0, config offsets 0-255. For everything
else, it will use (a) pci_mmcfg if there's a valid MCFG or (b)
pci_direct_conf1 if there's no MCFG and this is an AMD >= fam10h CPU,
i.e., PCI_HAS_IO_ECS is set.

  pci_arch_init
    type = pci_direct_probe
    pci_mmcfg_early_init
      __pci_mmcfg_init
        pci_mmcfg_arch_init
          raw_pci_ext_ops = &pci_mmcfg
    pci_direct_init
      if (type == 1)
        raw_pci_ops = &pci_direct_conf1
        if (raw_pci_ext_ops)
          return
        if (!(pci_probe & PCI_HAS_IO_ECS))
          return
        raw_pci_ext_ops = &pci_direct_conf1
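
To make the resulting policy explicit, here is a small standalone
model of it. This is only a sketch of my reading above, not kernel
code: struct pci_setup, enum accessor and pick_accessor() are all
invented names, and the three booleans stand in for "type 1 probe
succeeded", "valid MCFG" and "PCI_HAS_IO_ECS set".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum accessor { ACC_NONE, ACC_CONF1, ACC_MMCFG };

struct pci_setup {
	bool have_conf1;	/* pci_direct_probe() found type 1 */
	bool have_mcfg;		/* BIOS supplied a usable MCFG table */
	bool has_io_ecs;	/* PCI_HAS_IO_ECS set in pci_probe */
};

/* Which accessor would serve a read of 'reg' in 'domain'? */
static enum accessor pick_accessor(const struct pci_setup *s,
				   unsigned int domain, unsigned int reg)
{
	enum accessor ops, ext_ops = ACC_NONE;

	assert(s != NULL);

	/* Stand-in for raw_pci_ops */
	ops = s->have_conf1 ? ACC_CONF1 : ACC_NONE;

	/* Stand-in for raw_pci_ext_ops: MMCONFIG wins, I/O ECS is the fallback */
	if (s->have_mcfg)
		ext_ops = ACC_MMCFG;
	else if (s->have_conf1 && s->has_io_ecs)
		ext_ops = ACC_CONF1;

	/* Base config space in domain 0 goes through the base accessor */
	if (domain == 0 && reg < 256 && ops != ACC_NONE)
		return ops;

	return ext_ops;
}
```

In this model the only case where I/O ECS matters is have_mcfg ==
false with has_io_ecs == true; with a valid MCFG the extended access
never touches 0xcf8/0xcfc.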

I think we should try to get rid of amd_bus.c, e.g., only run
amd_postcore_init() for BIOS dates < 2015. It looks like a crutch
that is perpetuating buggy BIOSes and costing us maintenance effort.
We don't need anything similar for Intel CPUs, and I don't see a
compelling reason why we need it for AMD.

Bjorn

[1] BIOS and Kernel Developer's Guide for AMD Family 15h Models
00h-0Fh Processors Rev 3.14 (document number 42301)
[2] PCI Firmware Specification, Rev 3.0, June 20, 2005