Re: [RFC PATCH 3/6] soc: qcom: Add Qualcomm minidump kernel driver

From: Mukesh Ojha
Date: Wed Mar 15 2023 - 11:24:04 EST


Thanks for the review and comment.

On 3/9/2023 2:20 AM, Konrad Dybcio wrote:


On 8.03.2023 21:22, Srinivas Kandagatla wrote:


On 21/02/2023 11:25, Mukesh Ojha wrote:
Minidump is a best effort mechanism to collect useful and predefined
data for first level of debugging on end user devices running on
Qualcomm SoCs. It is built on the premise that System on Chip (SoC)
or subsystem part of SoC crashes, due to a range of hardware and
software bugs. Hence, the ability to collect accurate data is only
a best-effort. The data collected could be invalid or corrupted,
data collection itself could fail, and so on.

Qualcomm devices in engineering mode provides a mechanism for
generating full system ramdumps for post mortem debugging. But in some
cases it's however not feasible to capture the entire content of RAM.
The minidump mechanism provides the means for selecting region should
be included in the ramdump. The solution supports extracting the
ramdump/minidump produced either over USB or stored to an attached
storage device.

The core of minidump feature is part of Qualcomm's boot firmware code.
It initializes shared memory(SMEM), which is a part of DDR and
allocates a small section of it to minidump table i.e also called
global table of content (G-ToC). Each subsystem (APSS, ADSP, ...) has
their own table of segments to be included in the minidump, all
references from a descriptor in SMEM (G-ToC). Each segment/region has
some details like name, physical address and it's size etc. and it
could be anywhere scattered in the DDR.

Minidump kernel driver adds the capability to add linux region to be
dumped as part of ram dump collection. It provides appropriate symbol
to check its enablement and register client regions.

To simplify post mortem debugging, it creates and maintain an ELF
header as first region that gets updated with upon registration
of a new region.

Signed-off-by: Mukesh Ojha <quic_mojha@xxxxxxxxxxx>
---
[...]

+int qcom_minidump_ready(void)
+{
+    void *ptr;
+    struct device_node *np;
+    static bool is_smem_available = true;
+
+    if (!is_smem_available || !(np = of_find_compatible_node(NULL, NULL, "qcom,smem"))) {

just check for dt node here does not mean that smem device is available, you should probably check if the device is avaliable aswell using of_device_is_available()


We should proabably return -EPROBEDEFER incase the node is present and device is not present.
qcom_smem_get() seems to handle -EPROBE_DEFER internally, so this check
may be entirely redundant.

The main idea behind checking qcom,smem availability is to not stop client(core kernel) probe if the smem dt node is not present at all
and qcom_smem_get() will always return -EPROBE_DEFER for such cases.

For e.g: Again i am taking ramoops example which seems relevant for
this case, ramoops driver should still probe and not defer forever
if the smem node is not available at all for non-qcom device tree.

-Mukesh

Konrad