Re: [RFC][PATCH v3 2/3] Skip spin_locks in panic case and addWARN_ON()

From: Don Zickus
Date: Mon Dec 12 2011 - 10:59:00 EST


On Fri, Dec 02, 2011 at 05:10:26PM -0500, Seiji Aguchi wrote:
> Patch Description:
> - Skip spin_locks in panic case in both kmsg_dump() and pstore_dump() because they are
> serialized via smp_send_stop
>
> - Add WARN_ON() in "in_nmi() and !panic" case into kmsg_dump(). Currently, this case never
> happens because only kmsg_dump(KMSG_DUMP_PANIC) is called in NMI case.
> But if someone adds new kmsg_dump() into NMI path in the future, kmsg_dump() may deadlock.
> We can trap it and complain with this WARN_ON().
>
>
> With this patch, kmsg_dump()/pstore_dump() work as follows.
> panic case (KMSG_DUMP_PANIC):
> - don't take lock because they are serialized.
>
> not panic case (KMSG_DUMP_OOPS/KMSG_DUMP_EMERG/KMSG_DUMP_RESTART/KMSG_DUMP_HALT):
> - take locks normally
>
> Regarding as NMI case,
> - kmsg_dump()/pstore_dump() don't take locks, so deadlock issue will not happen
> because kmsg_dump() is called in just panic case with current implementation.
> - If someone adds new kmsg_dump() into NMI path, WARN_ON() is called.
> So we can trap it and ask to fix it.
>
> Signed-off-by: Seiji Aguchi <seiji.aguchi@xxxxxxx>
>
> ---
> fs/pstore/platform.c | 16 ++++++----------
> kernel/printk.c | 13 +++++++++++--
> 2 files changed, 17 insertions(+), 12 deletions(-)
>
> diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c
> index 57bbf90..823669e 100644
> --- a/fs/pstore/platform.c
> +++ b/fs/pstore/platform.c
> @@ -90,18 +90,17 @@ static void pstore_dump(struct kmsg_dumper *dumper,
> int hsize, ret;
> unsigned int part = 1;
> unsigned long flags = 0;
> - int is_locked = 0;
>
> if (reason < ARRAY_SIZE(reason_str))
> why = reason_str[reason];
> else
> why = "Unknown";
>
> - if (in_nmi()) {
> - is_locked = spin_trylock(&psinfo->buf_lock);
> - if (!is_locked)
> - pr_err("pstore dump routine blocked in NMI, may corrupt error record\n");
> - } else
> + /*
> + * pstore_dump() is serialized in panic path.
> + * So, we don't need to take any locks.
> + */
> + if (reason != KMSG_DUMP_PANIC)
> spin_lock_irqsave(&psinfo->buf_lock, flags);

This probably won't work because the original check wasn't necessarily for
locking purposes but to see if a lock was already taken in the serialized
path. If buf_lock is already held, the code may have trouble executing
the backend, hence the pr_err.


> oopscount++;
> while (total < kmsg_bytes) {
> @@ -131,10 +130,7 @@ static void pstore_dump(struct kmsg_dumper *dumper,
> total += l1_cpy + l2_cpy;
> part++;
> }
> - if (in_nmi()) {
> - if (is_locked)
> - spin_unlock(&psinfo->buf_lock);
> - } else
> + if (reason != KMSG_DUMP_PANIC)
> spin_unlock_irqrestore(&psinfo->buf_lock, flags);
> }
>
> diff --git a/kernel/printk.c b/kernel/printk.c
> index 1455a0d..bc5ac61 100644
> --- a/kernel/printk.c
> +++ b/kernel/printk.c
> @@ -1732,13 +1732,22 @@ void kmsg_dump(enum kmsg_dump_reason reason)
> unsigned long l1, l2;
> unsigned long flags;
>
> + WARN_ON(in_nmi() && reason != KMSG_DUMP_PANIC);
> +

You will need a comment to explain why the above is there.

> /* Theoretically, the log could move on after we do this, but
> there's not a lot we can do about that. The new messages
> will overwrite the start of what we dump. */
> - raw_spin_lock_irqsave(&logbuf_lock, flags);
> +
> + /*
> + * kmsg_dump(KMSG_DUMP_PANIC) is serialized.
> + * So, we don't need to take any locks.
> + */
> + if (reason != KMSG_DUMP_PANIC)
> + raw_spin_lock_irqsave(&logbuf_lock, flags);
> end = log_end & LOG_BUF_MASK;
> chars = logged_chars;
> - raw_spin_unlock_irqrestore(&logbuf_lock, flags);
> + if (reason != KMSG_DUMP_PANIC)
> + raw_spin_unlock_irqrestore(&logbuf_lock, flags);
>
> if (chars > end) {
> s1 = log_buf + log_buf_len - chars + end;

Cheers,
Don
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/