Re: [PATCH] x86/mm: Do not verify W^X at boot up
From: Steven Rostedt
Date: Mon Oct 24 2022 - 17:22:53 EST
On Mon, 24 Oct 2022 09:14:45 -0700
Dave Hansen <dave.hansen@xxxxxxxxx> wrote:
> On 10/24/22 08:45, Steven Rostedt wrote:
> > --- a/arch/x86/mm/pat/set_memory.c
> > +++ b/arch/x86/mm/pat/set_memory.c
> > @@ -587,6 +587,10 @@ static inline pgprot_t verify_rwx(pgprot_t old, pgprot_t new, unsigned long star
> > {
> > unsigned long end;
> >
> > + /* Kernel text is rw at boot up */
> > + if (system_state == SYSTEM_BOOTING)
> > + return new;
>
> Hi Steven,
>
> Thanks for the report and the patch. That seems reasonable, but I'm a
> bit worried that it opens up a big hole (boot time) when a W+X mapping
> could be created *anywhere*.
>
> Could we restrict this bypass to *only* kernel text addresses during
> boot? Maybe something like this:
>
> if ((system_state == SYSTEM_BOOTING) &&
> __kernel_text_address(start))
> return new;
>
> That would be safe because we know that kernel_text_address() addresses
> will be made read-only by the time userspace shows up and that
> is_kernel_inittext() addresses will be freed.
>
> Long-term, I wonder if we could teach the early patching code that it
> can't just use memcpy().
>
This is hacky, and I agree with Linus that ideally we can get text_poke()
working better and remove all these "special cases", but in case we
struggle to do so, the patch below also works.

It only covers the one case where ftrace sets up its trampoline at early
boot.
-- Steve
diff --git a/arch/x86/include/asm/ftrace.h b/arch/x86/include/asm/ftrace.h
index 908d99b127d3..41b3ecd23a08 100644
--- a/arch/x86/include/asm/ftrace.h
+++ b/arch/x86/include/asm/ftrace.h
@@ -25,6 +25,8 @@
#ifndef __ASSEMBLY__
extern void __fentry__(void);
+extern unsigned long ftrace_updated_trampoline;
+
static inline unsigned long ftrace_call_adjust(unsigned long addr)
{
/*
diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index bd165004776d..e2a1fc7bbe7a 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -417,7 +417,11 @@ create_trampoline(struct ftrace_ops *ops, unsigned int *tramp_size)
if (likely(system_state != SYSTEM_BOOTING))
set_memory_ro((unsigned long)trampoline, npages);
+ else
+ ftrace_updated_trampoline = (unsigned long)trampoline;
set_memory_x((unsigned long)trampoline, npages);
+ ftrace_updated_trampoline = 0;
+
return (unsigned long)trampoline;
fail:
tramp_free(trampoline);
diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
index 97342c42dda8..3fd3a45cafe8 100644
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -32,6 +32,7 @@
#include <asm/memtype.h>
#include <asm/hyperv-tlfs.h>
#include <asm/mshyperv.h>
+#include <asm/ftrace.h>
#include "../mm_internal.h"
@@ -579,6 +580,8 @@ static inline pgprot_t static_protections(pgprot_t prot, unsigned long start,
return __pgprot(pgprot_val(prot) & ~forbidden);
}
+unsigned long ftrace_updated_trampoline;
+
/*
* Validate strict W^X semantics.
*/
@@ -587,6 +590,10 @@ static inline pgprot_t verify_rwx(pgprot_t old, pgprot_t new, unsigned long star
{
unsigned long end;
+ /* The ftrace trampoline is still rw when it is made executable at boot up */
+ if (system_state == SYSTEM_BOOTING && ftrace_updated_trampoline == start)
+ return new;
+
/*
* 32-bit has some unfixable W+X issues, like EFI code
* and writeable data being in the same page. Disable