[PATCH 3.16 095/131] x86/speculation/l1tf: Protect swap entries against L1TF

From: Ben Hutchings
Date: Sat Sep 29 2018 - 17:58:00 EST


3.16.59-rc1 review patch. If anyone has any objections, please let me know.

------------------

From: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>

commit 2f22b4cd45b67b3496f4aa4c7180a1271c6452f6 upstream.

With L1 terminal fault the CPU speculates into unmapped PTEs, and resulting
side effects allow to read the memory the PTE is pointing too, if its
values are still in the L1 cache.

For swapped out pages Linux uses unmapped PTEs and stores a swap entry into
them.

To protect against L1TF it must be ensured that the swap entry is not
pointing to valid memory, which requires setting higher bits (between bit
36 and bit 45) that are inside the CPUs physical address space, but outside
any real memory.

To do this invert the offset to make sure the higher bits are always set,
as long as the swap file is not too big.

Note there is no workaround for 32bit !PAE, or on systems which have more
than MAX_PA/2 worth of memory. The later case is very unlikely to happen on
real systems.

[AK: updated description and minor tweaks by. Split out from the original
patch ]

Signed-off-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Signed-off-by: Andi Kleen <ak@xxxxxxxxxxxxxxx>
Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Tested-by: Andi Kleen <ak@xxxxxxxxxxxxxxx>
Reviewed-by: Josh Poimboeuf <jpoimboe@xxxxxxxxxx>
Acked-by: Michal Hocko <mhocko@xxxxxxxx>
Acked-by: Vlastimil Babka <vbabka@xxxxxxx>
Acked-by: Dave Hansen <dave.hansen@xxxxxxxxx>
[bwh: Backported to 3.16: Bit 9 may be reserved for PAGE_BIT_NUMA here]
Signed-off-by: Ben Hutchings <ben@xxxxxxxxxxxxxxx>
---
arch/x86/include/asm/pgtable_64.h | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)

--- a/arch/x86/include/asm/pgtable_64.h
+++ b/arch/x86/include/asm/pgtable_64.h
@@ -167,7 +167,7 @@ static inline int pgd_large(pgd_t pgd) {
*
* | ... | 11| 10| 9|8|7|6|5| 4| 3|2| 1|0| <- bit number
* | ... |SW3|SW2|SW1|G|L|D|A|CD|WT|U| W|P| <- bit names
- * | TYPE (59-63) | OFFSET (10-58) | 0 |0|0|X|X| X| X|X|SD|0| <- swp entry
+ * | TYPE (59-63) | ~OFFSET (10-58) | 0 |0|0|X|X| X| X|X|SD|0| <- swp entry
*
* G (8) is aliased and used as a PROT_NONE indicator for
* !present ptes. We need to start storing swap entries above
@@ -180,6 +180,9 @@ static inline int pgd_large(pgd_t pgd) {
*
* Bit 7 in swp entry should be 0 because pmd_present checks not only P,
* but also L and G.
+ *
+ * The offset is inverted by a binary not operation to make the high
+ * physical bits set.
*/
#define SWP_TYPE_BITS 5

@@ -199,13 +202,15 @@ static inline int pgd_large(pgd_t pgd) {
#define __swp_type(x) ((x).val >> (64 - SWP_TYPE_BITS))

/* Shift up (to get rid of type), then down to get value */
-#define __swp_offset(x) ((x).val << SWP_TYPE_BITS >> SWP_OFFSET_SHIFT)
+#define __swp_offset(x) (~(x).val << SWP_TYPE_BITS >> SWP_OFFSET_SHIFT)

/*
* Shift the offset up "too far" by TYPE bits, then down again
+ * The offset is inverted by a binary not operation to make the high
+ * physical bits set.
*/
#define __swp_entry(type, offset) ((swp_entry_t) { \
- ((unsigned long)(offset) << SWP_OFFSET_SHIFT >> SWP_TYPE_BITS) \
+ (~(unsigned long)(offset) << SWP_OFFSET_SHIFT >> SWP_TYPE_BITS) \
| ((unsigned long)(type) << (64-SWP_TYPE_BITS)) })

#define __pte_to_swp_entry(pte) ((swp_entry_t) { pte_val((pte)) })