Re: [RFC kgr on klp 0/9] kGraft on the top of KLP

From: Josh Poimboeuf
Date: Tue May 12 2015 - 11:20:20 EST


On Tue, May 12, 2015 at 11:45:15AM +0200, Jiri Kosina wrote:
> On Tue, 5 May 2015, Josh Poimboeuf wrote:
>
> > > Agreed ... under the condition that it can be made really 100% reliable
> > > *and* we'd be reasonably sure that we will be able to realistically
> > > achieve the same goal on other architectures as well. Have you even
> > > started exploring that space, please?
> >
> > Yes. As I postulated before [1], there are two obstacles to achieving
> > reliable frame pointer stack traces: 1) missing frame pointer logic and
> > 2) exceptions. If either 1 or 2 was involved in the creation of any of
> > the frames on the stack, some frame pointers might be missing, and one
> > or more frames could be skipped by the stack walker.
> >
> > The first obstacle can be overcome and enforced at compile time using
> > stackvalidate [1].
> >
> > The second obstacle can be overcome at run time with a future RFC:
> > something like a save_stack_trace_tsk_validate() function which does
> > some validations while it walks the stack. It can return an error if it
> > detects an exception frame.
> >
> > (It can also do some sanity checks like ensuring that it walks all the
> > way to the bottom of the stack and that each frame has a valid ktext
> > address. I also would propose a CONFIG_DEBUG_VALIDATE_STACK option
> > which tries to validate the stack on every call to schedule.)
> >
> > Then we can have the hybrid consistency model rely on
> > save_stack_trace_tsk_validate(). If the stack is deemed unsafe, we can
> > fall back to retrying later, or to the kGraft mode of user mode barrier
> > patching.
> >
> > Eventually I want to try to make *all* stacks reliable, even those with
> > exception frames. That would involve compile and run time validations
> > of DWARF data, and ensuring that DWARF and frame pointers are consistent
> > with each other. But those are general improvements which aren't
> > prerequisites for the hybrid model.
> >
> > [1] http://lkml.kernel.org/r/cover.1430770553.git.jpoimboe@xxxxxxxxxx
>
> Yup, I understand what is the goal here (and don't get me wrong, I am of
> course all for making frame pointer based stack traces reliable). The
> question I had was -- your patchset is now very x86-centric. If we are
> going to proceed with the hybrid patching model, we'd need to be able to
> extend to other architectures as easily as possible.
>
> I haven't yet tried to explore how difficult it would be to
> extend your approach to other archs. Have you?

Sorry, I missed that part of the question. Right now stackvalidate only
supports x86_64, but the framework is very generic. Support for other
architectures can be easily plugged in.

The same approach can be used for most architectures, including powerpc
and s390 (which both have back chain pointers which can be validated)
and arm64 (which requires CONFIG_FRAME_POINTER).
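
To make the validation idea a bit more concrete, here is a rough userspace
toy (not kernel code) of the two sanity checks mentioned earlier in the
thread: every frame must hold a valid text address, and the walk must reach
the bottom of the stack. The stack layout, the TEXT_START/TEXT_END bounds,
and the walk_validate() name are all made up for illustration; detecting
exception frames is not covered here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical text range; a real walker would ask the kernel instead. */
#define TEXT_START 0x1000u
#define TEXT_END   0x2000u

enum walk_result {
	WALK_OK       =  0,	/* reached the bottom, all frames sane */
	WALK_BAD_FP   = -1,	/* frame/backchain pointer is bogus */
	WALK_BAD_ADDR = -2,	/* return address is outside text */
};

/*
 * Walk a simulated stack of n words, starting at frame index fp.
 * Assumed (toy) frame layout:
 *   stack[fp]     = index of the caller's frame (saved frame pointer)
 *   stack[fp + 1] = return address
 * The outermost frame lives at index n - 2; reaching it means the walk
 * made it all the way to the bottom of the stack.
 */
static int walk_validate(const uintptr_t *stack, size_t n, size_t fp)
{
	assert(stack != NULL && n >= 2);

	for (;;) {
		uintptr_t ret, next;

		/* The frame must lie entirely within the stack. */
		if (fp > n - 2)
			return WALK_BAD_FP;

		/* Each frame needs a valid text address. */
		ret = stack[fp + 1];
		if (ret < TEXT_START || ret >= TEXT_END)
			return WALK_BAD_ADDR;

		if (fp == n - 2)
			return WALK_OK;

		/*
		 * The caller's frame must be strictly further toward the
		 * bottom; this also rules out loops in the chain.
		 */
		next = stack[fp];
		if (next < fp + 2 || next > n - 2)
			return WALK_BAD_FP;

		fp = (size_t)next;
	}
}
```

A caller would treat any nonzero return as "stack not reliable" and either
retry later or use the fallback mode.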

Those architectures which don't have frame/backchain pointers will still
have some live patching options:

- make DWARF stack data reliable
- only support the fallback mode of the hybrid model (syscall barrier
  switching)
- only support the "immediate" consistency model (using the same code as
  today)
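
One way to picture how the options above might be ordered per architecture
is the small sketch below. It is purely illustrative: struct arch_caps,
enum patch_model, and pick_model() are hypothetical names, not existing
kernel interfaces, and the priority order is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-architecture capabilities (illustrative only). */
struct arch_caps {
	bool frame_or_backchain;	/* validated frame/backchain pointers */
	bool reliable_dwarf;		/* DWARF stack data made reliable */
	bool barrier_switching;		/* syscall barrier switching works */
};

enum patch_model {
	PATCH_HYBRID,		/* stack checking plus fallback */
	PATCH_FALLBACK_ONLY,	/* only the barrier-switching fallback */
	PATCH_IMMEDIATE,	/* "immediate" model, same code as today */
};

/* Pick the strongest consistency model the architecture can support. */
static enum patch_model pick_model(const struct arch_caps *caps)
{
	assert(caps != NULL);

	/* Any source of reliable stack traces enables the full hybrid model. */
	if (caps->frame_or_backchain || caps->reliable_dwarf)
		return PATCH_HYBRID;

	/* Without reliable stacks, only the fallback half is usable. */
	if (caps->barrier_switching)
		return PATCH_FALLBACK_ONLY;

	return PATCH_IMMEDIATE;
}
```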

--
Josh