Re: [PATCH] x86/mm: Do not split_large_page() for set_kernel_text_rw()

From: Steven Rostedt
Date: Mon Aug 26 2019 - 07:33:14 EST


On Fri, 23 Aug 2019 11:36:37 +0200
Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:

> On Thu, Aug 22, 2019 at 10:23:35PM -0700, Song Liu wrote:
> > As 4k pages check was removed from cpa [1], set_kernel_text_rw() leads to
> > split_large_page() for all kernel text pages. This means a single kprobe
> > will put all kernel text in 4k pages:
> >
> > root@ ~# grep ffff81000000- /sys/kernel/debug/page_tables/kernel
> > 0xffffffff81000000-0xffffffff82400000 20M ro PSE x pmd
> >
> > root@ ~# echo ONE_KPROBE >> /sys/kernel/debug/tracing/kprobe_events
> > root@ ~# echo 1 > /sys/kernel/debug/tracing/events/kprobes/enable
> >
> > root@ ~# grep ffff81000000- /sys/kernel/debug/page_tables/kernel
> > 0xffffffff81000000-0xffffffff82400000 20M ro x pte
> >
> > To fix this issue, introduce CPA_FLIP_TEXT_RW to bypass "Text RO" check
> > in static_protections().
> >
> > Two helper functions set_text_rw() and set_text_ro() are added to flip
> > _PAGE_RW bit for kernel text.
> >
> > [1] commit 585948f4f695 ("x86/mm/cpa: Avoid the 4k pages check completely")
>
> ARGH; so this is because ftrace flips the whole kernel range to RW and
> back for giggles? I'm thinking _that_ is a bug, it's a clear W^X
> violation.

Since ftrace did this long before text_poke existed and long before
anybody cared (back in 2007), it's not really a bug.

Anyway, I believe Nadav has some patches somewhere that convert ftrace
to use the shadow page modification trick.

Or we also need the text_poke batch processing (did that get upstream?).

Mapping in 40,000 pages one at a time is noticeable from a human
standpoint.

-- Steve