Re: [PATCH v3 0/2] arm64: Fix pending single-step debugging issues
From: Sumit Garg
Date: Mon Jul 11 2022 - 08:44:48 EST
On Sat, 2 Jul 2022 at 03:44, Doug Anderson <dianders@xxxxxxxxxxxx> wrote:
> On Tue, May 10, 2022 at 11:05 PM Sumit Garg <sumit.garg@xxxxxxxxxx> wrote:
> > This patch-set reworks pending fixes from Wei's series [1] to make
> > single-step debugging via kgdb/kdb on arm64 work as expected. There was
> > a prior discussion on ML [2] regarding if we should keep the interrupts
> > enabled during single-stepping. So patch #1 follows suggestion from Will
> > [3] to not disable interrupts during single stepping but rather skip
> > single stepping within interrupt handler.
> > [1] https://lore.kernel.org/all/20200509214159.19680-1-liwei391@xxxxxxxxxx/
> > [2] https://lore.kernel.org/all/CAD=FV=Voyfq3Qz0T3RY+aYWYJ0utdH=P_AweB=13rcV8GDBeyQ@xxxxxxxxxxxxxx/
> > [3] https://lore.kernel.org/all/20200626095551.GA9312@willie-the-truck/
> > Changes in v3:
> > - Reword commit descriptions as per Daniel's suggestions.
> > Changes in v2:
> > - Replace patch #1 to rather follow Will's suggestion.
> > Sumit Garg (2):
> > arm64: entry: Skip single stepping into interrupt handlers
> > arm64: kgdb: Set PSTATE.SS to 1 to re-enable single-step
> > arch/arm64/include/asm/debug-monitors.h | 1 +
> > arch/arm64/kernel/debug-monitors.c | 5 +++++
> > arch/arm64/kernel/entry-common.c | 18 +++++++++++++++++-
> > arch/arm64/kernel/kgdb.c | 2 ++
> > 4 files changed, 25 insertions(+), 1 deletion(-)
> Sorry it took so long for me to respond. I kept dreaming that I'd find
> the time to really dig deep into this to understand it fully and I'm
> finally giving up on it.
No worries, and apologies on my part as well, as I had to find some time
to reproduce the issue that you reported below.
> I'm going to hope that Will and/or Catalin
> knows this area of the code well and can give it a good review. If not
> then I'll strive harder to make the time...
> In any case, I poked around with this a bunch and it definitely
> improved the stepping behavior a whole lot. I still got one case where
> gdb hit an assertion while I was stepping, but I could believe that
> was a problem with gdb? I couldn't reproduce it. Thus I can at least
> Tested-by: Douglas Anderson <dianders@xxxxxxxxxxxx>
Thanks for the testing.
> I'll also note that I _think_ I remember that with Wei's series that
> the gdb function "call" started working. I tried that here and it
> didn't seem so happy. To keep things simple, I created a dummy
> function in my kernel that looked like:
> void doug_test(void)
> {
>   pr_info("testing, 1 2 3\n");
> }
> I broke into the debugger by echoing "g" to /proc/sysrq-trigger and
> then tried "call doug_test()". I guess my printout actually printed
> but it wasn't so happy after that. Seems like it somehow ended up
> returning to a bogus address after the call which then caused a crash.
I am able to reproduce this issue on my setup as well, but it doesn't
seem to be a regression caused by this patch-set over Wei's series: I
could also reproduce it with the v1 patch-set, which was just a forward
port of pending patches from Wei's series to the latest
Maybe it's a different regression caused by other changes? BTW, do you
remember the kernel version you tested with Wei's series applied?
> testing, 1 2 3
> BUG: sleeping function called from invalid context at
> in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 3393, name: bash
> preempt_count: 0, expected: 0
> RCU nest depth: 1, expected: 0
> CPU: 6 PID: 3393 Comm: bash Not tainted 5.19.0-rc4+ #3
> Hardware name: Google Herobrine (rev1+) (DT)
> Call trace:
> Unable to handle kernel execute from non-executable memory at
> virtual address ffffffc008000000
> Mem abort info:
> ESR = 0x000000008600000f
> EC = 0x21: IABT (current EL), IL = 32 bits
> SET = 0, FnV = 0
> EA = 0, S1PTW = 0
> FSC = 0x0f: level 3 permission fault
> swapper pgtable: 4k pages, 39-bit VAs, pgdp=0000000082863000
> [ffffffc008000000] pgd=100000027ffff003, p4d=100000027ffff003,
> pud=100000027ffff003, pmd=100000027fffe003, pte=00680001001c3703
> Internal error: Oops: 8600000f [#1] PREEMPT SMP
> I'm not sure if that's a sign that something is missing with your patch or not.
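For anyone cross-checking the "Mem abort info" lines in the log above,
the ESR value can be decoded by hand. The following is only a minimal
userspace sketch, not kernel code: the helper names esr_ec(), esr_il()
and esr_fsc() are hypothetical, and it assumes the architectural
ESR_ELx layout (EC in bits [31:26], IL in bit 25, and for aborts the
fault status code in ISS bits [5:0]).

```c
#include <stdint.h>

/* Hypothetical helpers for illustration; not the kernel's own macros. */

/* Exception Class: ESR_ELx bits [31:26]. */
static inline uint32_t esr_ec(uint64_t esr)
{
	return (uint32_t)((esr >> 26) & 0x3f);
}

/* Instruction Length: ESR_ELx bit 25 (1 means a 32-bit instruction). */
static inline uint32_t esr_il(uint64_t esr)
{
	return (uint32_t)((esr >> 25) & 0x1);
}

/* Fault Status Code for instruction/data aborts: ISS bits [5:0]. */
static inline uint32_t esr_fsc(uint64_t esr)
{
	return (uint32_t)(esr & 0x3f);
}
```

With esr = 0x8600000f these give EC = 0x21, IL = 1 and FSC = 0x0f,
which agrees with the "IABT (current EL), IL = 32 bits" and "level 3
permission fault" lines in the log.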