Re: [EXTERNAL] Re: [PATCH v7 5/7] coresight: tmc: Add support for reading crash data

From: James Clark
Date: Mon Apr 15 2024 - 05:28:53 EST




On 15/04/2024 05:01, Linu Cherian wrote:
> Hi James,
>
>> -----Original Message-----
>> From: James Clark <james.clark@xxxxxxx>
>> Sent: Friday, April 12, 2024 3:36 PM
>> To: Linu Cherian <lcherian@xxxxxxxxxxx>; Suzuki K Poulose
>> <suzuki.poulose@xxxxxxx>
>> Cc: linux-arm-kernel@xxxxxxxxxxxxxxxxxxx; coresight@xxxxxxxxxxxxxxxx; linux-
>> kernel@xxxxxxxxxxxxxxx; robh+dt@xxxxxxxxxx;
>> krzysztof.kozlowski+dt@xxxxxxxxxx; conor+dt@xxxxxxxxxx;
>> devicetree@xxxxxxxxxxxxxxx; Sunil Kovvuri Goutham
>> <sgoutham@xxxxxxxxxxx>; George Cherian <gcherian@xxxxxxxxxxx>; Anil
>> Kumar Reddy H <areddy3@xxxxxxxxxxx>; Tanmay Jagdale
>> <tanmay@xxxxxxxxxxx>; mike.leach@xxxxxxxxxx; leo.yan@xxxxxxxxxx
>> Subject: [EXTERNAL] Re: [PATCH v7 5/7] coresight: tmc: Add support for
>> reading crash data
>>
>> Prioritize security for external emails: Confirm sender and content safety
>> before clicking links or opening attachments
>>
>> ----------------------------------------------------------------------
>>
>>
>> On 07/03/2024 03:36, Linu Cherian wrote:
>>> * Introduce a new mode CS_MODE_READ_CRASHDATA for reading trace
>>> captured in previous crash/watchdog reset.
>>>
>>> * Add special device files for reading ETR/ETF crash data.
>>>
>>> * User can read the crash data as below
>>>
>>> For example, for reading crash data from tmc_etf sink
>>>
>>> #dd if=/dev/crash_tmc_etfXX of=~/cstrace.bin
>>>
>>> Signed-off-by: Anil Kumar Reddy <areddy3@xxxxxxxxxxx>
>>> Signed-off-by: Tanmay Jagdale <tanmay@xxxxxxxxxxx>
>>> Signed-off-by: Linu Cherian <lcherian@xxxxxxxxxxx>
>>> ---
>>> Changelog from v6:
>>> * Removed read_prevboot flag in sysfs
>>> * Added special device files for reading crashdata
>>> * Renamed CS mode READ_PREVBOOT to READ_CRASHDATA
>>> * Setting the READ_CRASHDATA mode is done as part of file open.
>>>
>>
>> [...]
>>
>>> @@ -619,6 +740,19 @@ static int tmc_probe(struct amba_device *adev,
>> const struct amba_id *id)
>>> coresight_unregister(drvdata->csdev);
>>> else
>>> pm_runtime_put(&adev->dev);
>>> +
>>> + if (!is_tmc_reserved_region_valid(dev))
>>> + goto out;
>>> +
>>> + drvdata->crashdev.name =
>>> + devm_kasprintf(dev, GFP_KERNEL, "%s_%s", "crash",
>> desc.name);
>>> + drvdata->crashdev.minor = MISC_DYNAMIC_MINOR;
>>> + drvdata->crashdev.fops = &tmc_crashdata_fops;
>>> + ret = misc_register(&drvdata->crashdev);
>>> + if (ret)
>>> + pr_err("%s: Failed to setup dev interface for crashdata\n",
>>> + desc.name);
>>> +
>>
>> Is this all optional after the is_tmc_reserved_region_valid()? Skipping to out
>> seems to be more like an error condition, but in this case it's not? Having it
>> like this makes it more difficult to add extra steps to the probe function. You
>> could move it to a function and flip the condition which would be clearer:
>>
>
> Ack.
>
>> if (is_tmc_reserved_region_valid(dev))
>> register_crash_dev_interface(drvdata);
>>
>>
>>> out:
>>> return ret;
>>> }
>>
>> [...]
>>
>>>
>>> +static int tmc_etr_setup_crashdata_buf(struct tmc_drvdata *drvdata) {
>>> + int rc = 0;
>>> + u64 trace_addr;
>>> + struct etr_buf *etr_buf;
>>> + struct etr_flat_buf *resrv_buf;
>>> + struct tmc_crash_metadata *mdata;
>>> + struct device *dev = &drvdata->csdev->dev;
>>> +
>>> + mdata = drvdata->crash_mdata.vaddr;
>>> +
>>> + etr_buf = kzalloc(sizeof(*etr_buf), GFP_ATOMIC);
>>> + if (!etr_buf) {
>>> + rc = -ENOMEM;
>>> + goto out;
>>> + }
>>> + etr_buf->size = drvdata->crash_tbuf.size;
>>> +
>>> + resrv_buf = kzalloc(sizeof(*resrv_buf), GFP_ATOMIC);
>>> + if (!resrv_buf) {
>>> + rc = -ENOMEM;
>>> + goto rmem_err;
>>> + }
>>> +
>>> + /*
>>> + * Buffer address given by metadata for retrieval of trace data
>>> + * from previous boot is expected to be same as the reserved
>>> + * trace buffer memory region provided through DTS
>>> + */
>>> + if (is_tmc_reserved_region_valid(dev->parent)
>>> + && (drvdata->crash_tbuf.paddr == mdata->trc_paddr))
>>> + resrv_buf->vaddr = drvdata->crash_tbuf.vaddr;
>>> + else {
>>> + dev_dbg(dev, "Trace buffer address of previous boot
>> invalid\n");
>>> + rc = -EINVAL;
>>> + goto map_err;
>>> + }
>>> +
>>> + resrv_buf->size = etr_buf->size;
>>> + resrv_buf->dev = &drvdata->csdev->dev;
>>> + etr_buf->hwaddr = trace_addr;
>>
>> trace_addr is uninitialised at this point. If you pull the latest coresight branch
>> we enabled some extra compiler warnings which catch this.
>>
>> I assume it's supposed to be mdata->trc_paddr?
>
>
> Since no DMA operation happens here, its supposed to be kept NULL.
> Since it was redundant for crash data read operation, missed catching this. Will fix this.
>
>>
>> Is there some kind of test that could be added that could have caught this?
>> Maybe skip the full reboot flow but just enable the feature and see if we can
>> read from the buffer.
>
> As pointed above this field is redundant for crashdata read mode. So its not a functionality breakage as such.
>

Ah ok that's not as bad as I thought then. I went back to version 4 or 5
and thought I saw it was assigned so assumed there was a mistake.

>>
>> Or even just toggling the mode on and off via sysfs. We're running the Perf
>> and kselftests on Juno internally so I can add a reserved region to the Juno DT
>> and make sure this stays working.
>
> Did you meant adding a kselftest for tracing and reading back tracedata in sysfs mode using the reserved region ?
>

Yes just enable, disable then read back. I don't think we need to do a
decode, but make sure a non zero size is read. And the test can be
skipped if the reserved region doesn't exist.

> Thanks
> Linu Cherian.
>