Re: [RFC PATCH v2 05/18] sched: add task flag for preempt IRQ tracking

From: Andy Lutomirski
Date: Fri Apr 29 2016 - 20:09:17 EST


On Apr 29, 2016 3:41 PM, "Josh Poimboeuf" <jpoimboe@xxxxxxxxxx> wrote:
>
> On Fri, Apr 29, 2016 at 02:37:41PM -0700, Andy Lutomirski wrote:
> > On Fri, Apr 29, 2016 at 2:25 PM, Josh Poimboeuf <jpoimboe@xxxxxxxxxx> wrote:
> > > I think the easiest way to make it work would be to modify the idtentry
> > > macro to put all the idt entries in a dedicated section. Then the
> > > unwinder could easily detect any calls from that code.
> >
> > That would work. Would it make sense to do the same for the irq entries?
>
> Yes, I think so.
>
> > > > I suppose we could try to rejigger the code so that rbp points to
> > > > pt_regs or similar.
> > >
> > > I think we should avoid doing something like that because it would break
> > > gdb and all the other unwinders who don't know about it.
> >
> > How so?
> >
> > Currently, rbp in the entry code is meaningless. I'm suggesting that,
> > when we do, for example, 'call \do_sym' in idtentry, we point rbp to
> > the pt_regs. Currently it points to something stale (which the
> > dump_stack code might be relying on. Hmm.) But it's probably also
> > safe to assume that if you unwind to the 'call \do_sym', then pt_regs
> > is the next thing on the stack, so just doing the section thing would
> > work.
>
> Yes, rbp is meaningless on the entry from user space. But if an
> in-kernel interrupt occurs (e.g. page fault, preemption) and you have
> nested entry, rbp keeps its old value, right? So the unwinder can walk
> past the nested entry frame and keep going until it gets to the original
> entry.

Yes.

It would be nice if we could do better, though, and actually notice
the pt_regs and identify the entry. For example, I'd love to see
"page fault, RIP=xyz" printed in the middle of a stack dump on a
crash. Also, I think that just following rbp links will lose the
actual function that took the page fault (or whatever function
pt_regs->ip actually points to).
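
To make that concrete, here is a rough user-space toy model, not kernel
code. Everything in it is hypothetical: the toy_* names, the cut-down
"regs" struct, and the made-up addresses. It assumes what was discussed
above: the entry code lives in a dedicated address range, and when a
return address falls in that range, the saved registers are the next
thing on the stack. A plain rbp walk would report only the return
addresses; the extra check also recovers the interrupted IP.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical, cut-down stand-in for struct pt_regs. */
struct toy_regs {
	uint64_t ip;			/* instruction that was interrupted */
};

/* What "push %rbp; mov %rsp,%rbp" leaves behind: saved rbp, return address. */
struct toy_frame {
	const struct toy_frame *next_bp;
	uint64_t ret_addr;
};

/*
 * Frame of a handler called from entry code. Assumption: the saved
 * registers sit directly above the return address.
 */
struct toy_entry_frame {
	struct toy_frame frame;
	struct toy_regs regs;
};

struct toy_trace {
	uint64_t addr;
	int from_regs;			/* 1 if addr came from the saved regs */
};

/* Follow the rbp chain; on a return into [entry_start, entry_end), also
 * report the interrupted IP taken from the saved registers. */
size_t toy_unwind(const struct toy_frame *bp, uint64_t entry_start,
		  uint64_t entry_end, struct toy_trace *out, size_t max)
{
	size_t n = 0;

	while (bp && n < max) {
		out[n].addr = bp->ret_addr;
		out[n].from_regs = 0;
		n++;

		if (bp->ret_addr >= entry_start && bp->ret_addr < entry_end &&
		    n < max) {
			const struct toy_entry_frame *ef =
				(const struct toy_entry_frame *)bp;

			out[n].addr = ef->regs.ip;
			out[n].from_regs = 1;
			n++;
		}
		bp = bp->next_bp;
	}
	return n;
}

/* Print a trace, marking the entries that came from saved registers. */
void toy_print(const struct toy_trace *t, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (t[i].from_regs)
			printf("--- entry, RIP=%#llx ---\n",
			       (unsigned long long)t[i].addr);
		else
			printf("  %#llx\n", (unsigned long long)t[i].addr);
	}
}

/* Made-up scenario: a function faults at 0x1234, entry code in
 * [0x9000, 0xa000) calls the handler without touching rbp. */
size_t toy_demo(struct toy_trace *out, size_t max)
{
	struct toy_frame caller = { NULL, 0x3000 };
	struct toy_frame faulting = { &caller, 0x2000 };
	struct toy_entry_frame handler = { { &faulting, 0x9010 }, { 0x1234 } };

	return toy_unwind(&handler.frame, 0x9000, 0xa000, out, max);
}
```

Calling toy_demo() with room for eight entries and passing the result
to toy_print() lists 0x9010, then the interrupted IP 0x1234 flagged as
coming from the saved registers, then 0x2000 and 0x3000. Without the
range check, 0x1234 never shows up, because the handler's saved rbp
points at the faulting function's frame, whose return address belongs
to its caller.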

>
> > We should really re-add DWARF some day.
>
> Working on it :-)

Excellent.

Have you looked at my vdso unwinding test at all? If we could do
something similar for the kernel, IMO it would make testing much more
pleasant.

--Andy