Re: [RFC 2/4] PCI: generic: Add support for ARM64 and MSI(x)

From: Arnd Bergmann
Date: Tue Oct 07 2014 - 09:53:16 EST


On Tuesday 07 October 2014 13:06:59 Lorenzo Pieralisi wrote:
> On Wed, Oct 01, 2014 at 10:38:45AM +0100, Arnd Bergmann wrote:
>
> [...]
>
> > pci_mmap_page_range could either get generalized some more in an attempt
> > to have a __weak default implementation that works on ARM, or it could
> > be changed to lose the dependency on pci_sys_data instead. In either
> > case, the change would involve using the generic pci_host_bridge_window
> > list.
>
> On ARM pci_mmap_page_range requires pci_sys_data to retrieve its
> mem_offset parameter. I had a look, and I do not understand *why*
> it is required in that function, so I am asking. That function
> is basically used to map PCI resources to userspace, IIUC, through
> /proc or /sysfs file mappings. As far as I understand those mappings
> expect VMA pgoff to be the CPU address when files representing resources
> are mmapped from /proc and 0 when mmapped from /sys (I mean from
> userspace, then VMA pgoff should be updated by the kernel to map the
> resource).

Applying the mem_offset is certainly the more intuitive way, since
that lets you read the PCI BAR values from a device and access the
device with the appropriate offsets.

> Question is: why pci_mmap_page_range() should apply an additional
> shift to the VMA pgoff based on pci_sys_data.mem_offset, which represents
> the offset from cpu->bus offset. I do not understand that. PowerPC
> does not seem to apply that fix-up (in PowerPC __pci_mmap_make_offset there
> is commented out code which prevents the pci_mem_offset shift to be
> applied). I think it all boils down to what the userspace interface is
> expecting when the memory areas are mmapped, if anyone has comments on
> this that is appreciated.

The important part is certainly that whatever transformation is done
by pci_resource_to_user() gets undone by __pci_mmap_make_offset().
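
To make that invariant concrete, here is a minimal sketch of the two directions as plain arithmetic. It is not the kernel code: struct bridge_sketch, res_to_user(), user_to_cpu() and check_roundtrip() are hypothetical names, and it assumes mem_offset is the CPU address minus the PCI bus address.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-bridge data; not the kernel's struct. */
struct bridge_sketch {
	uint64_t mem_offset;	/* assumed: CPU address minus PCI bus address */
};

/* Direction of pci_resource_to_user(): CPU address -> value userspace sees. */
static uint64_t res_to_user(const struct bridge_sketch *b, uint64_t cpu_addr)
{
	return cpu_addr - b->mem_offset;
}

/* Direction of __pci_mmap_make_offset(): user value -> CPU address to map. */
static uint64_t user_to_cpu(const struct bridge_sketch *b, uint64_t user_addr)
{
	return user_addr + b->mem_offset;
}

/* The property that matters: the second step undoes the first. */
static void check_roundtrip(const struct bridge_sketch *b, uint64_t cpu_addr)
{
	assert(user_to_cpu(b, res_to_user(b, cpu_addr)) == cpu_addr);
}
```

If one side applies the offset and the other does not, the round trip is off by mem_offset, which is harmless only when mem_offset is zero.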

In case of PowerPC and Microblaze, the mem_offset handling is commented
out in both, to work around X11 trying to use the same values on
/dev/mem. However, they do have the respective fixup for io_offset.

sparc applies the offset in both places, for both io_offset and mem_offset.
xtensa applies only io_offset in __pci_mmap_make_offset, and applies
neither offset in pci_resource_to_user. This probably works because the
mem_offset is always zero there.
mips applies a different fixup (for 36-bit addressing), but not the
mem_offset.

Every other architecture applies no offset here, neither in
__pci_mmap_make_offset/pci_mmap_page_range nor in pci_resource_to_user.

The only hint I could find for how the ARM version came to be is
from the historic kernel tree git log for linux-2.5.42, which added
the current code as

2002/10/13 11:05:47+01:00 rmk
[ARM] Update pcibios_enable_device, supply pci_mmap_page_range()
Update pcibios_enable_device to only enable requested resources,
mainly for IDE. Supply a pci_mmap_page_range() function to allow
user space to mmap PCI regions.

At that point, only two platforms had a nonzero mem_offset:
footbridge/dc21285 and integrator/pci_v3. Both were using VGA,
and presumably used this to make X work. (rmk might remember
details).

The code at the time matched what powerpc and sparc did, but then
both implemented pci_resource_to_user() in order for libpciaccess
to work correctly (bcea1db16b for sparc, 463ce0e103f for powerpc),
and later powerpc changed it again to not apply the offset in
pci_resource_to_user or pci_mmap_page_range in 396a1a5832ae.

Arnd