Re: 2.6.29 pat issue
From: Thomas Hellström
Date: Fri Feb 06 2009 - 04:43:36 EST
Eric W. Biederman wrote:
Thomas Hellstrom <thellstrom@xxxxxxxxxx> writes:
Indeed, it's crucial to keep the mappings consistent, but failure to do so is a
kernel driver bug; it should never be the result of invalid user data.
It easily can be. Think of an X server mmapping frame buffers. Or other
device BARs.
Hmm, yes, you're right, although I'm still a bit doubtful about RAM pages.
Wait. Now I see what's causing the problems. The code is assuming that
VM_PFNMAP vmas never map RAM pages. That's also an invalid assumption.
See comments in mm/memory.c
So probably the attribute check should be done for the insert_pfn path
of VM_MIXEDMAP as well. That's not done today.
So there are three distinct bugs at this point:
1) VMAs with VM_PFNMAP are incorrectly assumed to be linear if
vma->vm_pgoff is non-zero.
2) VM_PFNMAP VMA PTEs are incorrectly assumed to never point to physical
RAM.
3) There is no check for the insert_pfn path of vm_insert_mixed().
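
To make bug 1 concrete, here is a minimal user-space sketch. It assumes the
linearity test amounts to "VM_PFNMAP is set and vm_pgoff is non-zero", as
described above. The struct, the flag value and every toy_* name are
hypothetical stand-ins for illustration, not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel flag; the value is illustrative only. */
#define TOY_VM_PFNMAP 0x1UL

/* Hypothetical, heavily reduced model of a VMA. */
struct toy_vma {
	unsigned long vm_flags;
	unsigned long vm_pgoff;  /* drivers may store any cookie here */
	bool really_linear;      /* ground truth that the heuristic cannot see */
};

/* The assumption from bug 1: PFNMAP plus non-zero pgoff means "linear". */
static bool toy_assumed_linear(const struct toy_vma *vma)
{
	assert(vma != 0);
	return (vma->vm_flags & TOY_VM_PFNMAP) && vma->vm_pgoff != 0;
}

/* True when the VMA is treated as linear although it is not. */
static bool toy_heuristic_misfires(const struct toy_vma *vma)
{
	return toy_assumed_linear(vma) && !vma->really_linear;
}
```

A VMA that a driver populates page by page, and whose vm_pgoff happens to be
non-zero, is classified as linear in this model, which is exactly the false
positive described in 1).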
IMHO checking each vm_insert_pfn() for caching attribute correctness is not
something that should be enabled by default, due to the CPU overhead. Production
drivers should never violate this.
If it is a problem the implementation should become more efficient. Userspace
as well as drivers can generate these mappings so even with a perfect driver
you cannot guarantee that someone else does not have that area of memory
mapped differently.
OK, so there seem to be a couple of things that can be done for
performance here:
1) A fastpath for single pages.
2) RAM pages are tracked with a page bit today.
Why not say that all memory backed by a struct page should be tracked with
a page bit? Then pfn_valid() could be used instead of page_is_ram().
This, combined with 1), should make tracking struct-page-backed pages
extremely fast.
3) If vm_insert_pfn() happens to be used on a linear VMA, it looks like
the whole VMA is being validated for each vm_insert_pfn(), which seems
extremely inefficient, considering the extensive tests in pagerange_is_ram().
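
A rough user-space sketch of how 1) and 2) could fit together, assuming the
memory type is kept per struct page and that a cheap pfn_valid()-style bounds
check decides whether a pfn is tracked that way. All toy_* names, the table
size and the return convention are made up for illustration; this is not
kernel code.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_PAGES 16UL  /* hypothetical size of the struct page array */

enum toy_memtype { TOY_WB = 0, TOY_WC, TOY_UC };

/* Hypothetical per-page record; the real idea is a bit in the page flags. */
struct toy_page {
	unsigned char memtype;
};

static struct toy_page toy_mem_map[TOY_NR_PAGES];

/* Stand-in for pfn_valid(): is this pfn backed by a struct page? */
static bool toy_pfn_valid(unsigned long pfn)
{
	return pfn < TOY_NR_PAGES;
}

/*
 * Single-page fast path: one bounds check and one compare.
 * Returns 0 on match, 1 on conflict, -1 if the pfn is not struct page
 * backed and the caller has to fall back to a slower range lookup.
 */
static int toy_check_one(unsigned long pfn, enum toy_memtype want)
{
	if (!toy_pfn_valid(pfn))
		return -1;
	return toy_mem_map[pfn].memtype == want ? 0 : 1;
}

/* Range check built on the fast path; stops at the first non-match. */
static int toy_check_range(unsigned long pfn, unsigned long nr,
			   enum toy_memtype want)
{
	unsigned long i;

	assert(nr > 0);
	for (i = 0; i < nr; i++) {
		int ret = toy_check_one(pfn + i, want);

		if (ret)
			return ret;
	}
	return 0;
}
```

With something like this, a vm_insert_pfn() of a single page would only pay
for toy_check_one(), instead of revalidating the whole VMA as in 3).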
/Thomas
Eric