Re: [PATCHv2 0/5] arm64/irqentry: remove duplicate housekeeping of

From: Mark Rutland
Date: Fri Sep 24 2021 - 13:36:31 EST


[Adding Paul for RCU, s390 folk for entry code RCU semantics]

On Fri, Sep 24, 2021 at 09:28:32PM +0800, Pingfan Liu wrote:
> After introducing arm64/kernel/entry_common.c, which is akin to
> kernel/entry/common.c, the rcu/trace housekeeping is done twice, as
> follows:
> enter_from_kernel_mode()->rcu_irq_enter().
> And
> gic_handle_irq()->...->handle_domain_irq()->irq_enter()->rcu_irq_enter()
>
> Besides being redundant, code analysis suggests the duplication also
> causes bugs, e.g. rcu_data->dynticks_nmi_nesting is incremented by 2,
> which makes rcu_is_cpu_rrupt_from_idle() return an unexpected result.

Hmmm...

The fundamental questions are:

1) Who is supposed to be responsible for doing the rcu entry/exit?

2) Is it supposed to matter if this happens multiple times?

For (1), I'd generally expect that this is supposed to happen in the
arch/common entry code, since that itself (or the irqchip driver) could
depend on RCU, and if that's the case then handle_domain_irq()
shouldn't need to call rcu_irq_enter(). That would be consistent with
the way we handle all other exceptions.

For (2), I don't know whether the level of nesting is supposed to
matter. I was under the impression it wasn't meant to matter in general,
so I'm a little surprised that rcu_is_cpu_rrupt_from_idle() depends on a
specific level of nesting.

From a glance it looks like this would cause rcu_sched_clock_irq() to
skip setting TIF_NEED_RESCHED, and to not call invoke_rcu_core(), which
doesn't sound right, at least...
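
To make the concern concrete, here is a toy userspace model of the
nesting check. It is only a sketch: struct toy_rcu_data and the toy_*()
helpers are hypothetical stand-ins, and the "+1 when leaving idle, +2
otherwise" increment is my simplified reading of rcu_nmi_enter(), not
the kernel code itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the per-CPU rcu_data fields; NOT the kernel structure. */
struct toy_rcu_data {
	long dynticks_nesting;		/* 0 while the CPU is idle from RCU's view */
	long dynticks_nmi_nesting;	/* irq/NMI nesting counter */
};

/*
 * Simplified entry accounting (assumption): the first entry from idle
 * adds 1, every nested entry adds 2.
 */
void toy_rcu_irq_enter(struct toy_rcu_data *rdp)
{
	long incby = (rdp->dynticks_nmi_nesting == 0) ? 1 : 2;

	rdp->dynticks_nmi_nesting += incby;
}

/* Simplified exit accounting, mirroring the entry side. */
void toy_rcu_irq_exit(struct toy_rcu_data *rdp)
{
	assert(rdp->dynticks_nmi_nesting > 0);	/* catch unbalanced exits */

	if (rdp->dynticks_nmi_nesting != 1)
		rdp->dynticks_nmi_nesting -= 2;
	else
		rdp->dynticks_nmi_nesting = 0;
}

/*
 * Simplified "did this interrupt arrive from idle?" check: only true
 * at the first nesting level.
 */
bool toy_rrupt_from_idle(const struct toy_rcu_data *rdp)
{
	if (rdp->dynticks_nmi_nesting > 1)
		return false;

	return rdp->dynticks_nesting == 0;
}
```

In this model a single toy_rcu_irq_enter() from idle leaves the counter
at 1 and toy_rrupt_from_idle() returns true, whereas a second enter for
the same interrupt moves it to 3 and the check returns false, which is
the behaviour the cover letter describes.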

Thomas, Paul, thoughts?

AFAICT, s390 will have a similar flow on its IRQ handling path, so if
this is a real issue they'll be affected too.

Thanks,
Mark.

> NMI handling also suffers from duplicate accounting. This series aims
> to address these duplication issues.
> [1-2/5]: address nmi account duplicate
> [3-4/5]: address rcu housekeeping duplicate in irq
> [5/5]: as a natural result of [3-4/5], address a history issue. [1]
>
>
> History:
> v1 -> v2:
> change the subject as the motivation varies.
> add the fix for nmi account duplicate
>
> The subject of v1 is "[PATCH 1/3] kernel/irq: __handle_domain_irq()
> makes irq_enter/exit arch optional". [2] It is brought up to fix [1].
>
> There have been some tries to enable crash-stop-NMI on arm64, one by me,
> the other by Yuichi's [4]. I hope after this series, they can advance,
> as Marc said in [3] "No additional NMI patches will make it until we
> have resolved the issues"
>
> [1] https://lore.kernel.org/linux-arm-kernel/87lfewnmdz.fsf@xxxxxxxxxxxxxxxxxxxxxxx/
> [2] https://lore.kernel.org/linux-arm-kernel/1607912752-12481-1-git-send-email-kernelfans@xxxxxxxxx
> [3] https://lore.kernel.org/linux-arm-kernel/afd82be798cb55fd2f96940db7be78c0@xxxxxxxxxx
> [4] https://lore.kernel.org/linux-arm-kernel/20201104080539.3205889-1-ito-yuichi@xxxxxxxxxxx
>
> Cc: Catalin Marinas <catalin.marinas@xxxxxxx>
> Cc: Will Deacon <will@xxxxxxxxxx>
> Cc: Mark Rutland <mark.rutland@xxxxxxx>
> Cc: Marc Zyngier <maz@xxxxxxxxxx>
> Cc: Joey Gouly <joey.gouly@xxxxxxx>
> Cc: Sami Tolvanen <samitolvanen@xxxxxxxxxx>
> Cc: Julien Thierry <julien.thierry@xxxxxxx>
> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Cc: Yuichi Ito <ito-yuichi@xxxxxxxxxxx>
> Cc: linux-kernel@xxxxxxxxxxxxxxx
> To: linux-arm-kernel@xxxxxxxxxxxxxxxxxxx
>
>
> Pingfan Liu (5):
> arm64/entry-common: push the judgement of nmi ahead
> irqchip/GICv3: expose handle_nmi() directly
> kernel/irq: make irq_{enter,exit}() in handle_domain_irq() arch
> optional
> irqchip/GICv3: let gic_handle_irq() utilize irqentry on arm64
> irqchip/GICv3: make reschedule-ipi light weight
>
> arch/arm64/Kconfig | 1 +
> arch/arm64/include/asm/irq.h | 7 ++++
> arch/arm64/kernel/entry-common.c | 45 +++++++++++++++-------
> arch/arm64/kernel/irq.c | 29 ++++++++++++++
> drivers/irqchip/irq-gic-v3.c | 66 ++++++++++++++++++++------------
> kernel/irq/Kconfig | 3 ++
> kernel/irq/irqdesc.c | 4 ++
> 7 files changed, 116 insertions(+), 39 deletions(-)
>
> --
> 2.31.1
>