Re: [PATCH 4/6] ftrace, x86: make kernel text writable only forconversions

From: Mathieu Desnoyers
Date: Sun Feb 22 2009 - 23:54:19 EST


* Steven Rostedt (rostedt@xxxxxxxxxxx) wrote:
>
> On Sun, 22 Feb 2009, Mathieu Desnoyers wrote:
> > >
> > > We are changing over 19000 locations in the kernel. This touches almost
> > > all kernel text pages anyway. You want to map a page in and out for over
> > > 19000 locations?
> > >
> > > -- Steve
> > >
> >
> > Hi Steve,
> >
> > Can you provide numbers to indicate why it's required to be so intrusive
> > in the kernel mappings while doing these modifications ? I think opening
> > such time window where standard code mapping is writeable globally in
> > config RO_DATA kernels could open the door to unexpected side-effects,
> > so ideally going through the "backdoor" page mapped by text_poke seems
> > safer. Given similar performance, I would tend to use a text_poke-like
> > approach.
> >
>
> Not sure which numbers you are asking for. The dynamic function tracer
> modifies all mcount calls, which are done by practically every function in
> the kernel. With a normal Fedora kernel (and all its loaded modules),
> that's between 15 to 20 thousand functions, depending on what modules are
> loaded.
>

I mean comparing the cost of changing the kernel mappings and doing the
edits (as you do) vs doing it through a text-poke-like mapping.

> At boot up we convert them all to nops, but when we enable the function
> tracer, we convert them back to calls to the function tracer. This is done
> by a priviledge user, since the function tracer can add quite a bit of
> overhead when activated.
>
> I do not really see how changing this for the short period of time is any
> different than making another mapping point to the kernel code. If you
> could find a way to break this security, you should be able to break it
> with another mapping as well.

It's not only about breaking the security. It's mostly to insure
internal kernel consistency. While you are changing these mappings, you
could possibly have a window where other kernel code is running (irq,
other cpus threads). That code could itself be buggy and use the
writeable window to overwrite some kernel code or RO data. This
side-effect would go undetected while users think the RO data *is* is,
but it wouldn't. You also bring a good point about security : if someone
ever rely on CONFIG_DEBUG_RODATA for some security reasons, then we give
a big window where kernel text and ro data is writeable at *known*
addresses, while we can randomize the address used for text_poke.

>
> Also note that this dynamic tracing code works for not only x86, it also
> works for PPC, ARM, superH and ia64. To use text_poke, that would require
> all of these to have text_poke ported.
>

Do these architecture have DEBUG_RODATA config ? If not, then a simple
memcpy is ok.

> How does text_poke solve the issue of executing code on another CPU that
> is changing?
>

text_poke itself does not provide that. This must be insured by the
user on a case-by-case basis. For instance, kprobes is changing code
atomically _and_ just inserting/removing breakpoints. Doing this is fine
with cross-cpu code modification (XMC). alternative code is only
changing code when the CPU is UP, so it's also ok. However, changing
multi-bytes instructions without changing those for a trap-generating
instruction when the CPUs are up (SMP) falls into the erratas, and the
code that uses text_poke must carefully perform this modification, e.g.
by doing like I do in my immediate values patchset in the lttng git
tree (using a temporary breakpoint while doing the modification). This
would apply directly to the function tracer, and you could get rid of
this ugly latency-inducing stop_machine() call.

Mathieu

> -- Steve
>

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/