Re: WARNING: can't access registers at asm_common_interrupt

From: Peter Zijlstra
Date: Wed Nov 11 2020 - 12:47:49 EST


On Wed, Nov 11, 2020 at 11:05:36AM -0600, Josh Poimboeuf wrote:
> On Fri, Nov 06, 2020 at 06:04:15AM +0000, Shinichiro Kawasaki wrote:
> > Greetings,
> >
> > I observe "WARNING: can't access registers at asm_common_interrupt+0x1e/0x40"
> > in my kernel test system repeatedly, which is printed by unwind_next_frame() in
> > "arch/x86/kernel/unwind_orc.c". Syzbot already reported that [1]. Similar
> > warning was reported and discussed [2], but I suppose the cause is not yet
> > clarified.
> >
> > The warning was observed with v5.10-rc2 and older tags. I bisected and found
> > that the commit 044d0d6de9f5 ("lockdep: Only trace IRQ edges") in v5.9-rc3
> > triggered the warning. Reverting that from 5.10-rc2, the warning disappeared.
> > May I ask comment by expertise on CC how this commit can relate to the warning?
> >
> > The test condition to reproduce the warning is rather unique (blktests,
> > dm-linear and ZNS device emulation by QEMU). If any action is suggested for
> > further analysis, I'm willing to take it with my test system.
> >
> > Wish this report helps.
> >
> > [1] https://lkml.org/lkml/2020/9/6/231
> > [2] https://lkml.org/lkml/2020/9/8/1538
>
> Shin'ichiro,
>
> Thanks for all the data. It looks like the ORC unwinder is getting
> confused by paravirt patching (with runtime-patched pushf/pop changing
> the stack layout).
>
> <user interrupt>
> exit_to_user_mode_prepare()
> exit_to_user_mode_loop()
> local_irq_disable_exit_to_user()
> local_irq_disable()
> raw_irqs_disabled()
> arch_irqs_disabled()
> arch_local_save_flags()
> pushfq
> <another interrupt>

This is PARAVIRT_XXL only, which is a Xen special. My preference, as
always, is to kill it... Sadly the Xen people have a different opinion.

> Objtool doesn't know about the pushf/pop paravirt patch, so ORC gets
> confused by the changed stack layout.
>
> I'm thinking we either need to teach objtool how to deal with
> save_fl/restore_fl patches, or we need to just get rid of those nasty
> patches somehow. Peter, any thoughts?

Don't use Xen? ;-)

So with PARAVIRT_XXL the compiler will emit something like:

"CALL *pvops.save_fl"

Which we then overwrite at runtime with "pushf; pop %[re]ax" and a few
NOPs.

Now, objtool understands alternatives, and ensures they have the same
stack layout, it has no chance in hell of understanding this, simply
because paravirt_patch.c is magic.

I don't have any immediate clever ideas, but let me ponder it a wee bit.

....

Something really disguisting we could do is recognise the indirect call
offset and emit an extra ORC entry for RIP+1. So the cases are:

CALL *pv_ops.save_fl -- 7 bytes IIRC
CALL $imm; -- 5 bytes
PUSHF; POP %[RE]AX -- 2 bytes

so the RIP+1 (the POP insn) will only ever exist in this case. The
indirect and direct call cases would never land on that IP.

....


> It looks like 044d0d6de9f5 ("lockdep: Only trace IRQ edges") is making
> the problem more likely, by adding the irqs_disabled() check for every
> local_irq_disable().
>
> Also - Peter, Nicholas - is that irqs_disabled() check really necessary
> in local_irq_disable()? Presumably irqs would typically be be enabled
> before calling it?

Yeah, so it's all a giant can of worms that; also see:

https://lkml.kernel.org/r/20200821084738.508092956@xxxxxxxxxxxxx

The basic idea is to only trace edges, ie. when the hardware state
actually changes. Sadly this means doing a pushf/pop before the cli.
Ideally CLI would store the old IF in CF or something like that, but
alas.