Re: [PATCH] arm64: traps: disable irq in die()

From: Will Deacon
Date: Tue Jul 04 2017 - 13:18:08 EST


On Wed, Jun 28, 2017 at 05:04:12PM +0800, Qiao Zhou wrote:
> In the current die(), IRQs are disabled only while __die() runs, not
> during any subsequent panic() handling. Since the logging in __die()
> can take several hundred milliseconds, a new IRQ may arrive and
> interrupt the current die().
>
> If the process calling die() holds a critical resource that another
> process scheduled later also needs, the system can deadlock, and the
> first panic will never be executed.
>
> So disable IRQs for the whole flow of die().

Could you give an example of this going wrong, please?

>
> Signed-off-by: Qiao Zhou <qiaozhou@xxxxxxxxxxxx>
> ---
> arch/arm64/kernel/traps.c | 10 ++++++++--
> 1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
> index 0805b44..b12bf0f 100644
> --- a/arch/arm64/kernel/traps.c
> +++ b/arch/arm64/kernel/traps.c
> @@ -274,10 +274,13 @@ static DEFINE_RAW_SPINLOCK(die_lock);
>  void die(const char *str, struct pt_regs *regs, int err)
>  {
>  	int ret;
> +	unsigned long flags;
> +
> +	local_irq_save(flags);
>
>  	oops_enter();
>
> -	raw_spin_lock_irq(&die_lock);
> +	raw_spin_lock(&die_lock);

Can we instead take die_lock before calling oops_enter(), or does
that break something else?

>  	console_verbose();
>  	bust_spinlocks(1);
>  	ret = __die(str, err, regs);
> @@ -287,13 +290,16 @@ void die(const char *str, struct pt_regs *regs, int err)
>
>  	bust_spinlocks(0);
>  	add_taint(TAINT_DIE, LOCKDEP_NOW_UNRELIABLE);
> -	raw_spin_unlock_irq(&die_lock);
> +	raw_spin_unlock(&die_lock);
>  	oops_exit();
>
>  	if (in_interrupt())
>  		panic("Fatal exception in interrupt");
>  	if (panic_on_oops)
>  		panic("Fatal exception");
> +
> +	local_irq_restore(flags);

We could also move the raw_spin_unlock_irq() down here.
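
To make the ordering from the two comments above concrete, here is a
minimal userspace sketch, not kernel code and not a tested patch. All
of the *_model functions and the state variables are hypothetical
stand-ins that only track whether IRQs would be enabled at each step;
they are not the real kernel primitives. The model assumes the lock is
taken with the _irq variant before oops_enter() and released with the
_irq variant after the panic checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace stand-ins: they only track IRQ and lock state. */
static bool irqs_enabled = true;
static bool die_lock_held;
static int steps_with_irqs_on;	/* steps that ran with IRQs enabled */
static int panic_calls;

/* Record one step of the oops path and the IRQ state it ran under. */
static void step(const char *name)
{
	if (irqs_enabled)
		steps_with_irqs_on++;
	printf("%-12s irqs %s\n", name, irqs_enabled ? "on" : "off");
}

/* Models raw_spin_lock_irq(): disable IRQs, then take the lock. */
static void lock_irq_model(void)
{
	assert(!die_lock_held);
	irqs_enabled = false;
	die_lock_held = true;
}

/* Models raw_spin_unlock_irq(): drop the lock, then enable IRQs. */
static void unlock_irq_model(void)
{
	assert(die_lock_held);
	die_lock_held = false;
	irqs_enabled = true;
}

/* The real panic() does not return; the model only records the call. */
static void panic_model(const char *msg)
{
	(void)msg;
	step("panic");
	panic_calls++;
}

/*
 * Ordering under discussion: lock before oops_enter(), unlock after
 * the panic checks. The "else" stands in for panic() never returning.
 */
static void die_model(bool in_interrupt, bool panic_on_oops)
{
	lock_irq_model();
	step("oops_enter");
	step("__die");
	step("oops_exit");

	if (in_interrupt)
		panic_model("Fatal exception in interrupt");
	else if (panic_on_oops)
		panic_model("Fatal exception");

	unlock_irq_model();
}
```

Calling die_model(true, false) reports every step, including the
panic, with IRQs off, which is the property the patch is after. In the
real die() the unlock would never be reached once panic() fires, and
whether oops_enter() is safe to call with die_lock held is exactly the
open question above.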

Will