Re: [PATCH v13 0/8] Coresight for Kernel panic and watchdog reset
From: Linu Cherian
Date: Tue Feb 04 2025 - 07:05:01 EST
Hi James,
On 2025-01-24 at 17:38:58, James Clark (james.clark@xxxxxxxxxx) wrote:
>
>
> On 16/12/2024 5:30 am, Linu Cherian wrote:
> > This patch series is rebased on coresight-next-v6.12.rc4
> >
> > * Patches 1 & 2 add support for allocation of trace buffer pages from
> > reserved RAM
> > * Patches 3 & 4 add support for saving metadata at the time of kernel panic
> > * Patch 5 adds support for reading trace data captured at the time of panic
> > * Patches 6 & 7 add support for disabling coresight blocks at the time of panic
> > * Patch 8 gives the full description of this feature as part of documentation
> >
> > v12 is posted here,
> > https://lore.kernel.org/linux-arm-kernel/20241129084714.3057080-1-lcherian@xxxxxxxxxxx/
> >
> > Changelog from v12:
> > * Fixed wrong buffer pointer passed to coresight_insert_barrier_packet
> > * tmc_read_prepare/unprepare_crashdata need to be called only once and
> > hence removed from read path and added to tmc_probe
> > * tmc_read_prepare_crashdata renamed to tmc_prepare_crashdata and
> > avoids taking locks as it's moved to the probe function.
> > * Introduced a read status flag, "reading", specific to the reserved buffer to keep
> > reserved buffer reading independent of the regular buffer.
> > * open/release ops for the reserved buffer only need to set/unset the
> > "reading" status flag, as the reserved buffer is prepared
> > during probe time itself.
> > * Few other trivial changes
> >
>
> Hi Linu,
>
> I tested that decoding a crash dump of ETM1 (trace ID 17) from panic kernel
> works:
>
> $ ./ptm2human -i cstrace.bin
>
> ...
> There is no valid data in the stream of ID 16
> Decode trace stream of ID 17
> Syncing the trace stream...
> Decoding the trace stream...
> instruction addr at 0x140c9afc, ARM state, secure state,
> ...
Thanks for trying this out.
>
> I noticed that once in the panic kernel Coresight becomes unusable, and the
> Perf Coresight tests fail, with no obvious way to reset it other than a cold
> boot:
>
> $ perf record -e cs_etm//u -- true
> $ perf report -D | grep AUX
> ...
> AUX data lost 27 times out of 27!
> ...
>
> I didn't debug it yet. I thought it might be something to do with the RESRV
> buffer mode, but it doesn't look like that should be the case from the code.
> Perhaps it's the claim tags and coresight_is_claimed_any() lingering, so it's
> not really an issue that's introduced by this change?
Is that problem reproducible without this series applied?
Thanks.
Linu Cherian.