Re: Map PCI resource to user space problem.

From: Armin Schindler
Date: Tue Jun 21 2011 - 09:20:21 EST


On Tue, 21 Jun 2011, Matthew Wilcox wrote:
On Mon, Jun 20, 2011 at 10:41:38AM -0600, Bjorn Helgaas wrote:
On Mon, Jun 20, 2011 at 10:38 AM, Armin Schindler <armin@xxxxxxxxxx> wrote:
On Mon, 20 Jun 2011, Armin Schindler wrote:

Hello,

When I try to map a PCI resource of size < PAGE_SIZE to user space
with e.g. remap_pfn_range(), the mapping seems to be created,
but access to the PCI device just doesn't work (reads return 0xff).

The same happens when using sysfs. The device has
# ls -al /sys/bus/pci/devices/0000\:02\:09.0/resource*
-rw------- 1 root root     512 Jun 20 11:33 /sys/bus/pci/devices/0000:02:09.0/resource0
-rw------- 1 root root     256 Jun 20 11:33 /sys/bus/pci/devices/0000:02:09.0/resource1
-rw------- 1 root root 8388608 Jun 20 11:33 /sys/bus/pci/devices/0000:02:09.0/resource2
-rw------- 1 root root  262144 Jun 20 11:33 /sys/bus/pci/devices/0000:02:09.0/resource3

and when I mmap() resource0, a valid address is returned, but reading the
area always returns 0xff.

Since a kernel driver can access resource0 here without problems when
using ioremap(), I'm not sure what is wrong with the sysfs entry for
user space (or with using remap_pfn_range() in my own mmap function).
The only hint I have so far is the size, which is smaller than PAGE_SIZE.

I found the reason. It is not the size of the resource.
resource0 is assigned to an address that is not page-aligned.
For example, PCI resource0 has address 0xfe5ffc00. mmap()ing it
actually seems to map address 0xfe5ff000, and the user needs to add 0xc00
to the address returned by mmap().
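
To make the arithmetic above concrete, here is a minimal sketch of how user
space could compute the in-page offset. The helper names (bar_page_base,
bar_page_offset, bar_regs) are hypothetical and not part of any kernel or
libc API; the page size is assumed to be a power of two, and the BAR start
address is assumed to be known already (for example, read from the
device's sysfs "resource" file).

```c
#include <stddef.h>
#include <stdint.h>

/* Page-aligned address that the mapping actually starts at.
 * page_size must be a power of two (e.g. 4096). */
static uint64_t bar_page_base(uint64_t bar_start, uint64_t page_size)
{
	return bar_start & ~(page_size - 1);
}

/* Byte offset of the BAR inside that page. */
static uint64_t bar_page_offset(uint64_t bar_start, uint64_t page_size)
{
	return bar_start & (page_size - 1);
}

/* Given the pointer returned by mmap() on the resource file, return a
 * pointer to the first byte of the BAR itself. */
static volatile uint8_t *bar_regs(void *mapped, uint64_t bar_start,
				  uint64_t page_size)
{
	return (volatile uint8_t *)mapped +
	       bar_page_offset(bar_start, page_size);
}
```

With the address from the example, bar_page_base(0xfe5ffc00, 4096) gives
0xfe5ff000 and bar_page_offset(0xfe5ffc00, 4096) gives 0xc00, which is the
value that has to be added to the pointer returned by mmap().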

Oh, ugh. That's enforced outside the PCI layer.
For example (from sys_x86_64.c):

SYSCALL_DEFINE6(mmap, unsigned long, addr, unsigned long, len,
		unsigned long, prot, unsigned long, flags,
		unsigned long, fd, unsigned long, off)
{
	long error;
	error = -EINVAL;
	if (off & ~PAGE_MASK)
		goto out;

i.e. the very first thing the _arch specific_ code does is check that
the offset is page-aligned.

The offset given to mmap() is not the problem. When you want to mmap
a PCI memory BAR via sysfs, you normally use offset=0 to get the BAR
from the beginning. The problem is the base address (the physical address
of that BAR as assigned by PCI), which is not page-aligned.

We can't fix this. There may be other BARs mapped into the same page,
so allowing the user to mmap this BAR would give them access to those
other BARs that they may not be entitled to access.
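
The concern above can be illustrated with a small check for whether two
BARs touch a common page. This is only a sketch: bars_share_page is a
hypothetical helper, not an existing kernel function, and it assumes both
lengths are non-zero and the page size is a power of two.

```c
#include <stdbool.h>
#include <stdint.h>

/* Return true if the byte ranges [a_start, a_start + a_len) and
 * [b_start, b_start + b_len) occupy at least one common page.
 * Assumes a_len > 0, b_len > 0 and page_size is a power of two. */
static bool bars_share_page(uint64_t a_start, uint64_t a_len,
			    uint64_t b_start, uint64_t b_len,
			    uint64_t page_size)
{
	uint64_t mask = ~(page_size - 1);
	uint64_t a_first = a_start & mask;
	uint64_t a_last = (a_start + a_len - 1) & mask;
	uint64_t b_first = b_start & mask;
	uint64_t b_last = (b_start + b_len - 1) & mask;

	/* Two page ranges intersect when neither ends before the other starts. */
	return a_first <= b_last && b_first <= a_last;
}
```

For instance, a 512-byte BAR at 0xfe5ffc00 and a 256-byte BAR at a made-up
address of 0xfe5ffe00 both live in the page at 0xfe5ff000, so mapping that
page for one of them exposes the other as well.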

If a page is 'reused' like that and the remap functions for user space
only work at page granularity, then yes.
Besides the access problem with other BARs, the user doesn't get a correct
address from mmap().

Armin
