[PATCH 0/1] Use proper mask when setting PUD mapping

From: Aaron Lu
Date: Thu Aug 18 2022 - 22:29:09 EST


Hello,

While addressing community's review comments about how PTI treats Global
bit in kernel page table, I noticed most part of the direct mapping in
the kernel page table has the Global bit set, while I think the intent
is to unset Global bit when PTI is on for the majority mappings.

e.g. on my desktop with i7-7700K/32G memory, the following part has
Global bit set according to /sys/kernel/debug/page_tables/kernel:
---[ Low Kernel Mapping ]---
... ...
0xffff888140000000-0xffff888840000000 28G RW PSE GLB NX pud
... ...

There are also many other parts in the direct mapping that have Global
bit set, I just showed the biggest part.

Leaving these entries with Global bit set doesn't look good, as Global
means when switching from kernel to user mode, this entry, if has
been used in kernel mode, can remain in TLB and user can potentially
make use of this entry to speculatively access memory that they shouldn't
access, reducing the effect of clearing kernel part in user page tables.

I tracked this issue to commit c164fbb40c43f("x86/mm: thread
pgprot_t through init_memory_mapping()") where it used
__PAGE_KERNEL_LARGE instead of PAGE_KERNEL_LARGE for PUD mapping.
__PAGE_KERNEL_LARGE is a raw value, it doesn't get the mask of
__default_kernel_pte_mask, hence leaving Global bit on these PUD
mappings(and mappings that are split from these PUD mappings).

While to see if this left Global bit can really make bad things happen,
I wrote a test code based on secret.c from:
https://github.com/IAIK/meltdown.
In less than 10 tries, the physical_reader program can recover the
secret strings set in the secret2.c program so I think this issue should
be fixed. I run the two programs like this:
# start secret2 from one terminal, bind it to cpu2
$ sudo taskset -c 2 ./secret2
[+] Secret: Burn after reading this string, it is a secret stringce and kernel
[+] Physical address of secret: 0x120121000
[+] Exit with Ctrl+C if you are done reading the secret
# in another terminal, run physical_reader. make sure to bind to the
# same cpu so that it can see the TLB for the page where secret string
# resides:
$ sudo taskset -c 2 ./physical_reader 0x120121000 0xffff888000000000
[+] Physical address : 0x120121000
[+] Physical offset : 0xffff888000000000
[+] Reading virtual address: 0xffff888120121000
)
Burn after reading this string, it is a secret stringce and kern^C

This patch fixes this problem by using proper mask when setting PUD
mappings.

Aaron Lu (1):
x86/mm: Use proper mask when setting PUD mapping

arch/x86/mm/init_64.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--
2.37.1

#include "libkdump.h"
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>

const char *strings[] = {
"If you can read this, this is really bad",
"Burn after reading this string, it is a secret string",
"Congratulations, you just spied on an application",
"Wow, you broke the security boundary between user space and kernel",
"Welcome to the wonderful world of microarchitectural attacks",
"Please wait while we steal your secrets...",
"Don't panic... But your CPU is broken and your data is not safe",
"How can you read this? You should not read this!"};

int main(int argc, char *argv[]) {
libkdump_config_t config;
int fd;
char *p, *q;
config = libkdump_get_autoconfig();
libkdump_init(config);

srand(time(NULL));
const char *secret = strings[rand() % (sizeof(strings) / sizeof(strings[0]))];
int len = strlen(secret);

fd = open("/home/aaron/test", O_RDWR);
if (fd == -1) {
perror("open");
return -1;
}
ftruncate(fd, 0x1000);

p = mmap(NULL, 0x1000, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
if (p == MAP_FAILED) {
perror("mmap");
close(fd);
return -1;
}
memcpy(p, secret, len);

printf("\x1b[32;1m[+]\x1b[0m Secret: \x1b[33;1m%s\x1b[0m\n", p);

size_t paddr = libkdump_virt_to_phys((size_t)p);
if (!paddr) {
printf("\x1b[31;1m[!]\x1b[0m Program requires root privileges (or read access to /proc/<pid>/pagemap)!\n");
libkdump_cleanup();
exit(1);
}

printf("\x1b[32;1m[+]\x1b[0m Physical address of secret: \x1b[32;1m0x%zx\x1b[0m\n", paddr);
printf("\x1b[32;1m[+]\x1b[0m Exit with \x1b[37;1mCtrl+C\x1b[0m if you are done reading the secret\n");
while (1) {
// keeps doing write(2) such that the direct map will stay in TLB
write(fd, secret, len);
sched_yield();
lseek(fd, 0, SEEK_SET);
}

libkdump_cleanup();

return 0;
}