There appears to be no formal statement of what pv_irq_ops.save_fl() is
supposed to return precisely. Native returns the full flags, while lguest and
Xen only return the Interrupt Flag, and both have comments by the
implementations stating that only the Interrupt Flag is looked at. This may
have been true when initially implemented, but no longer is.

To make matters worse, the Xen PVOP leaves the upper bits undefined, making
the BUG_ON() in setup_smap() undefined behaviour. Experimentally, this now
trips for 32-bit PV guests on Broadwell hardware. The BUG_ON() is consistent
for an individual build, but not consistent across builds. It has also been a
sitting time bomb since SMAP support was introduced.

Use native_save_fl() instead, which will obtain an accurate view of the AC
flag.

Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
CC: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
CC: Ingo Molnar <mingo@xxxxxxxxxx>
CC: H. Peter Anvin <hpa@xxxxxxxxx>
CC: x86@xxxxxxxxxx
CC: linux-kernel@xxxxxxxxxxxxxxx
CC: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
CC: Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>
CC: David Vrabel <david.vrabel@xxxxxxxxxx>
CC: xen-devel <xen-devel@xxxxxxxxxxxxx>
CC: Rusty Russell <rusty@xxxxxxxxxxxxxxx>
CC: lguest@xxxxxxxxxxxxxxxx
---
This patch is RFC because I am not certain that native_save_fl() is
necessarily the correct solution on lguest, but it does seem that setup_smap()
wants to check the actual AC bit, rather than an idealised value.
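
To illustrate the concern, here is a minimal userspace sketch, not kernel
code. The model_*() helpers and ac_check_trips() are hypothetical names
invented for this example. The sketch assumes the PV hook defines only the IF
bit and lets the remaining bits carry whatever happened to be there, which is
how the Xen behaviour is described above. The bit positions are the
architectural EFLAGS.IF (bit 9) and EFLAGS.AC (bit 18).

```c
#include <stdbool.h>

#define X86_EFLAGS_IF 0x00000200UL /* bit 9: Interrupt Flag */
#define X86_EFLAGS_AC 0x00040000UL /* bit 18: Alignment Check / SMAP override */

/*
 * Hypothetical model of a PV save_fl() that only defines the IF bit.
 * 'junk' stands in for whatever the undefined upper bits happen to hold.
 */
static unsigned long model_pv_save_fl(bool irqs_enabled, unsigned long junk)
{
	unsigned long fl = junk & ~X86_EFLAGS_IF;

	return irqs_enabled ? (fl | X86_EFLAGS_IF) : fl;
}

/* Hypothetical model of a native read: the real register value, unmodified. */
static unsigned long model_native_save_fl(unsigned long real_eflags)
{
	return real_eflags;
}

/* The kind of test a boot-time AC sanity check performs on the result. */
static bool ac_check_trips(unsigned long flags)
{
	return (flags & X86_EFLAGS_AC) != 0;
}
```

In this model, ac_check_trips(model_pv_save_fl(true, X86_EFLAGS_AC)) is true
even though no AC bit was ever set in the real register, whereas the native
model only trips when the real value has AC set.
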
A different approach, given the dual nature of the AC flag now, is to gate
setup_smap() on a kernel rpl of 0. SMAP necessarily can't be used in a
paravirtual situation where the kernel runs in cpl > 0.

A third approach would be to formally state that
pv_irq_ops.save_fl() needs to return all the flags, which would make
local_irq_save() safe to use in this circumstance, but that makes a hotpath
longer for the sake of a single boot-time check.
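
For the rpl-gating alternative mentioned above, the test itself could be as
small as the following sketch. This is a standalone illustration rather than a
proposed patch: selector_rpl() and smap_usable_at() are hypothetical helpers,
and the mask is defined locally. It relies only on the architectural fact that
the low two bits of a segment selector hold the requested privilege level.

```c
#include <stdbool.h>
#include <stdint.h>

#define SEL_RPL_MASK 0x3u /* low two bits of a selector hold the RPL */

/* Extract the requested privilege level from a segment selector. */
static unsigned int selector_rpl(uint16_t sel)
{
	return sel & SEL_RPL_MASK;
}

/*
 * Hypothetical gate: only consider SMAP setup when the kernel's code
 * segment selector indicates it is running at privilege level 0.
 */
static bool smap_usable_at(uint16_t kernel_cs)
{
	return selector_rpl(kernel_cs) == 0;
}
```

With illustrative selector values, smap_usable_at(0x10) is true, while a
selector with a non-zero rpl such as 0x61 makes it false.
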