Re: [RFC PATCH 0/7] Trace events to pstore

From: Joel Fernandes
Date: Wed Sep 02 2020 - 17:55:08 EST


On Wed, Sep 2, 2020 at 5:47 PM Joel Fernandes <joel@xxxxxxxxxxxxxxxxx> wrote:
>
> On Wed, Sep 2, 2020 at 4:01 PM Nachammai Karuppiah
> <nachukannan@xxxxxxxxx> wrote:
> >
> > Hi,
> >
> > This patch series adds support to store trace events in pstore.
> >

Been a long day...

> > Storing trace entries in persistent RAM would help in understanding what
> > happened just before the system went down. The trace events that led to the
> > crash can be retrieved from the pstore after a warm reboot. This will help
> > debug what happened before machine’s last breath. This has to be done in a
> > scalable way so that tracing a live system does not impact the performance
> > of the system.
>
> Just to add, Nachammai was my intern in the recent outreachy program
> and we designed together a way for trace events to be written to
> pstore backed memory directory instead of regular memory. The basic

s/directory/directly/

> idea is to allocate frace's ring buffer on pstore memory and have it
> right there. Then recover it on reboot. Nachammai wrote the code with

s/right/write/

- Joel


> some guidance :) . I talked to Steve as well in the past about the
> basic of idea of this. Steve is on vacation this week though.
>
> This is similar to what +Sai Prakash Ranjan was trying to do sometime
> ago: https://lkml.org/lkml/2018/9/8/221 . But that approach involved
> higher overhead due to synchronization of writing to the otherwise
> lockless ring buffer.
>
> +Brian Norris has also expressed interest for this feature.
>
> thanks,
>
> - Joel
>
> >
> > This requires a new backend - ramtrace that allocates pages from
> > persistent storage for the tracing utility. This feature can be enabled
> > using TRACE_EVENTS_TO_PSTORE.
> > In this feature, the new backend is used only as a page allocator and
> > once the users chooses to use pstore to record trace entries, the ring
> > buffer pages are freed and allocated in pstore. Once this switch is done,
> > ring_buffer continues to operate just as before without much overhead.
> > Since the ring buffer uses the persistent RAM buffer directly to record
> > trace entries, all tracers would also persist across reboot.
> >
> > To test this feature, I used a simple module that would call panic during
> > a write operation to file in tracefs directory. Before writing to the file,
> > the ring buffer is moved to persistent RAM buffer through command line
> > as shown below,
> >
> > $echo 1 > /sys/kernel/tracing/options/persist
> >
> > Writing to the file,
> > $echo 1 > /sys/kernel/tracing/crash/panic_on_write
> >
> > The above write operation results in system crash. After reboot, once the
> > pstore is mounted, the trace entries from previous boot are available in file,
> > /sys/fs/pstore/trace-ramtrace-0
> >
> > Looking through this file, gives us the stack trace that led to the crash.
> >
> > <...>-1 [001] .... 49.083909: __vfs_write <-vfs_write
> > <...>-1 [001] .... 49.083933: panic <-panic_on_write
> > <...>-1 [001] d... 49.084195: printk <-panic
> > <...>-1 [001] d... 49.084201: vprintk_func <-printk
> > <...>-1 [001] d... 49.084207: vprintk_default <-printk
> > <...>-1 [001] d... 49.084211: vprintk_emit <-printk
> > <...>-1 [001] d... 49.084216: __printk_safe_enter <-vprintk_emit
> > <...>-1 [001] d... 49.084219: _raw_spin_lock <-vprintk_emit
> > <...>-1 [001] d... 49.084223: vprintk_store <-vprintk_emit
> >
> > Patchwise oneline description is given below:
> >
> > Patch 1 adds support to allocate ring buffer pages from persistent RAM buffer.
> >
> > Patch 2 introduces a new backend, ramtrace.
> >
> > Patch 3 adds methods to read previous boot pages from pstore.
> >
> > Patch 4 adds the functionality to allocate page-sized memory from pstore.
> >
> > Patch 5 adds the seq_operation methods to iterate through trace entries.
> >
> > Patch 6 modifies ring_buffer to allocate from ramtrace when pstore is used.
> >
> > Patch 7 adds ramtrace DT node as child-node of /reserved-memory.
> >
> > Nachammai Karuppiah (7):
> > tracing: Add support to allocate pages from persistent memory
> > pstore: Support a new backend, ramtrace
> > pstore: Read and iterate through trace entries in PSTORE
> > pstore: Allocate and free page-sized memory in persistent RAM buffer
> > tracing: Add support to iterate through pages retrieved from pstore
> > tracing: Use ramtrace alloc and free methods while using persistent
> > RAM
> > dt-bindings: ramtrace: Add ramtrace DT node
> >
> > .../bindings/reserved-memory/ramtrace.txt | 13 +
> > drivers/of/platform.c | 1 +
> > fs/pstore/Makefile | 2 +
> > fs/pstore/inode.c | 46 +-
> > fs/pstore/platform.c | 1 +
> > fs/pstore/ramtrace.c | 821 +++++++++++++++++++++
> > include/linux/pstore.h | 3 +
> > include/linux/ramtrace.h | 28 +
> > include/linux/ring_buffer.h | 19 +
> > include/linux/trace.h | 13 +
> > kernel/trace/Kconfig | 10 +
> > kernel/trace/ring_buffer.c | 663 ++++++++++++++++-
> > kernel/trace/trace.c | 312 +++++++-
> > kernel/trace/trace.h | 5 +-
> > 14 files changed, 1924 insertions(+), 13 deletions(-)
> > create mode 100644 Documentation/devicetree/bindings/reserved-memory/ramtrace.txt
> > create mode 100644 fs/pstore/ramtrace.c
> > create mode 100644 include/linux/ramtrace.h
> >
> > --
> > 2.7.4
> >