On 09/28/2015 01:49 PM, Martin Schwidefsky wrote:
> On Thu, 24 Sep 2015 17:05:48 +0200
> Vlastimil Babka <vbabka@xxxxxxx> wrote:
[...]
>> However, __get_user_pages_fast() is still broken. The get_user_pages_fast()
>> wrapper will hide this in the common case. The other user of the __ variant
>> is kvm, which is mentioned as the reason for the removal of emulated hugepages.
>> The call to page_cache_get_speculative() also looks broken in this scenario
>> on debug builds because of VM_BUG_ON_PAGE(PageTail(page), page). With
>> CONFIG_TINY_RCU enabled, there's a plain atomic_inc(&page->_count), which also
>> probably shouldn't happen for a tail page...
> It boils down to __get_user_pages_fast() being broken for emulated large
> pages, doesn't it? My preferred fix would be to get __get_user_pages_fast()
> to work in this case.
I agree, but I didn't know enough about the architecture to attempt such a
fix :) Thanks!
> For 3.12 a patch would look like this (needs more testing though):
FWIW, it works for me in the particular LTP test, but as you said, it
needs more testing, and breaking stable would suck.
> @@ -103,7 +104,7 @@ static inline int gup_pmd_range(pud_t *pudp, pud_t pud, unsigned long addr,
>  		unsigned long end, int write, struct page **pages, int *nr)
>  {
>  	unsigned long next;
> -	pmd_t *pmdp, pmd;
> +	pmd_t *pmdp, pmd, pmd_orig;
> 
>  	pmdp = (pmd_t *) pudp;
>  #ifdef CONFIG_64BIT
> @@ -112,7 +113,7 @@ static inline int gup_pmd_range(pud_t *pudp, pud_t pud, unsigned long addr,
>  	pmdp += pmd_index(addr);
>  #endif
>  	do {
> -		pmd = *pmdp;
> +		pmd = pmd_orig = *pmdp;
>  		barrier();
>  		next = pmd_addr_end(addr, end);
>  		/*
> @@ -127,8 +128,9 @@ static inline int gup_pmd_range(pud_t *pudp, pud_t pud, unsigned long addr,
>  		if (pmd_none(pmd) || pmd_trans_splitting(pmd))
>  			return 0;
>  		if (unlikely(pmd_large(pmd))) {
> -			if (!gup_huge_pmd(pmdp, pmd, addr, next,
> -					write, pages, nr))
> +			if (!gup_huge_pmd(pmdp, pmd_orig,
> +					pmd_swlarge_deref(pmd),
> +					addr, next, write, pages, nr))
> 				return 0;
>  		} else if (!gup_pte_range(pmdp, pmd, addr, next,
>  				write, pages, nr))