Re: [Xen-devel] NUMA_BALANCING and Xen PV guest regression in 3.20-rc0
From: David Vrabel
Date: Fri Feb 20 2015 - 05:28:34 EST
On 19/02/15 23:09, Linus Torvalds wrote:
> On Thu, Feb 19, 2015 at 5:06 AM, David Vrabel <david.vrabel@xxxxxxxxxx> wrote:
>> The NUMA_BALANCING series beginning with 5d833062139d (mm: numa: do not
>> dereference pmd outside of the lock during NUMA hinting fault) and
>> specifically 8a0516ed8b90 (mm: convert p[te|md]_numa users to
>> p[te|md]_protnone_numa) breaks Xen 64-bit PV guests.
>> Any fault on a present userspace mapping (e.g., a write to a read-only
>> mapping) is being misinterpreted as a NUMA hinting fault and not handled
>> correctly. All userspace programs end up continuously faulting.
>> This is because the hypervisor sets _PAGE_GLOBAL (== _PAGE_PROTNONE) on
>> all present userspace page table entries.
> That's some crazy stuff, but whatever. The patch is clearly good. Applied,
Xen PV guests do not use any hardware virtualization features. In
particular they do not use nested paging.

A 64-bit PV guest runs in user mode for both kernel and userspace. On
kernel to user mode transitions, the hypervisor flips between two sets
of page tables (the user mode tables do not contain any kernel mappings,
but the kernel mode tables contain both). By setting _PAGE_GLOBAL on
the userspace entries, a kernel to user transition can avoid flushing
the userspace mappings from the TLB.
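
To make the bit aliasing concrete, here is a minimal userspace sketch in
C. It is not the kernel code and not the applied patch: the macro names
(without the leading underscore) and all three helper functions are
hypothetical, made up for illustration. Only the bit positions (present
is bit 0, global is bit 8 on x86) and the _PAGE_GLOBAL == _PAGE_PROTNONE
aliasing come from the discussion above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* x86 PTE flag bits; names shortened for this sketch. */
#define PAGE_PRESENT  (UINT64_C(1) << 0)
#define PAGE_RW       (UINT64_C(1) << 1)
#define PAGE_USER     (UINT64_C(1) << 2)
#define PAGE_GLOBAL   (UINT64_C(1) << 8)
/* PROTNONE reuses the GLOBAL bit. */
#define PAGE_PROTNONE PAGE_GLOBAL

static_assert(PAGE_PROTNONE == PAGE_GLOBAL, "the two flags share one bit");

/*
 * Hypothetical: a present, read-only userspace PTE as a Xen PV guest
 * would see it, with the GLOBAL bit set by the hypervisor.
 */
static uint64_t xen_pv_user_ro_pte(void)
{
    return PAGE_PRESENT | PAGE_USER | PAGE_GLOBAL;
}

/* Hypothetical check that looks at the shared bit alone. */
static bool protnone_bit_only(uint64_t pte)
{
    return (pte & PAGE_PROTNONE) != 0;
}

/* Hypothetical check that also requires the entry to be not present. */
static bool protnone_not_present(uint64_t pte)
{
    return (pte & (PAGE_PROTNONE | PAGE_PRESENT)) == PAGE_PROTNONE;
}
```

For the PTE returned by xen_pv_user_ro_pte(), the first check reports
"protnone" even though the entry is present, which is the kind of
misclassification described above. The second check rejects it, while
still accepting an entry that has only the shared bit set and the
present bit clear.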