Re: gdb switches to __sysvec_apic_timer_interrupt or __default_send_IPI_dest_field with KVM enabled

From: Vasyl Vavrychuk
Date: Mon Feb 07 2022 - 09:45:44 EST


Thanks a lot for these fixes, which I can use, and for the detailed explanation.

On Mon, Jan 31, 2022 at 12:42 PM Maxim Levitsky <mlevitsk@xxxxxxxxxx> wrote:
> I recently fixed that, and the code AFAIK is upstream, but probably the qemu
> side of it hasn't yet made it into a release.

You are right. I have also observed an unrelated gdb issue when debugging
the kernel under QEMU and prepared a packaging backport:
https://salsa.debian.org/gdb-team/gdb/-/merge_requests/9

> I patched the lx-symbols script to at least work with recent gdb, but this no doubt relies on at least some undefined
> behavior in gdb, therefore I didn't push this further.
>
> https://patchwork.kernel.org/project/kvm/patch/20210811122927.900604-5-mlevitsk@xxxxxxxxxx/

What a coincidence, I use lx-symbols with an external kernel module. I
have noticed that it sometimes behaves strangely, but I somehow found
an order of commands in which it works for me.

On Mon, Jan 31, 2022 at 12:42 PM Maxim Levitsky <mlevitsk@xxxxxxxxxx> wrote:
>
> On Sat, 2022-01-29 at 23:06 +0200, Vasyl Vavrychuk wrote:
> > Hello,
> >
> > I run Linux kernel under qemu-system-x86_64 via the "-kernel" option.
> >
> > Also, I added the "-s" option to accept the gdb connection.
> >
> > After Linux boots up, I connect with gdb and set a breakpoint in some
> > function, for example "device_del"; it does not really matter which.
> >
> > The problem is that if I also use "--enable-kvm", then after the breakpoint
> > triggers and I send "n" from gdb, it switches to
> >
> > __sysvec_apic_timer_interrupt (regs=0xffffc90000297de8) at
> > arch/x86/kernel/apic/apic.c:1102
> > 1102 trace_local_timer_entry(LOCAL_TIMER_VECTOR);
> >
> > or to
> >
> > __default_send_IPI_dest_field (mask=<optimized out>,
> > vector=<optimized out>, dest=dest@entry=2048) at
> > arch/x86/kernel/apic/ipi.c:161
> > 161 cfg = __prepare_ICR2(mask);
> >
> > I am stepping over kernel code that does not perform any waiting or blocking.
> >
> > Everything works fine with "--enable-kvm" removed.
>
> I recently fixed that, and the code AFAIK is upstream, but probably the qemu
> side of it hasn't yet made it into a release.
>
> The problem you are seeing is that every time you single step, an interrupt
> occurs because you are not as fast as the computer is - the timer interrupt happens
> about 1000 times a second, so after each single step you do, one will be pending.
>
> That makes GDB land you in the interrupt handler, which is correct
> technically but makes single stepping pretty much impossible.
>
> The solution is to tell the kernel to mask interrupts regardless of
> whether they are masked by the guest, something that qemu already does when TCG
> is used but which was not implemented for KVM.
>
> Best regards,
> Maxim Levitsky
>
> PS: you might also want to patch the kernel's lx-symbols gdb script to fix loadable module support,
> which currently doesn't work well - I ran out of time to upstream it, I'll get to it
> someday.
>
> The problem here is that the kernel's gdb script uses a breakpoint in the function that
> loads modules, and when it hits, it reloads gdb symbols - that is frowned upon in the gdb docs,
> but is pretty much the only way to do it.
>
> I patched the lx-symbols script to at least work with recent gdb, but this no doubt relies on at least some undefined
> behavior in gdb, therefore I didn't push this further.
>
> https://patchwork.kernel.org/project/kvm/patch/20210811122927.900604-5-mlevitsk@xxxxxxxxxx/
>
>
>
> >
> > Thanks,
> > Vasyl
> >
>
>
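
To put a number on the explanation quoted above: with a timer that fires
roughly 1000 times a second, even a fast human "n" leaves hundreds of ticks
behind. Below is a minimal back-of-the-envelope sketch in Python. The function
names and the simplified model (a tick counts as pending if it fired while the
guest was stopped in gdb) are my own assumptions for illustration, not
something taken from the KVM or QEMU code.

```python
def ticks_elapsed(step_delay_s: float, timer_hz: int = 1000) -> int:
    """Whole timer periods that pass while the guest sits stopped in gdb.

    timer_hz=1000 is the rough figure from the explanation above; the real
    rate depends on the guest kernel configuration (assumption).
    """
    return int(step_delay_s * timer_hz)


def step_lands_in_handler(step_delay_s: float, timer_hz: int = 1000) -> bool:
    """Simplified model: the next single step is hijacked by the timer
    interrupt if at least one tick fired while we were stopped."""
    return ticks_elapsed(step_delay_s, timer_hz) >= 1


if __name__ == "__main__":
    # Half a second between two "n" commands is already quick for a human.
    print(ticks_elapsed(0.5))
    print(step_lands_in_handler(0.5))
```

In this model, only a delay shorter than one timer period (about 1 ms at
1000 Hz) avoids the handler. No interactive session can manage that, which is
why masking interrupts during the step is the practical fix.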