Re: [PATCH v3 2/8] hisi_ptt: Register PMU device for PTT trace

From: Yicong Yang
Date: Tue Feb 08 2022 - 02:41:47 EST


On 2022/2/7 19:42, Jonathan Cameron wrote:
> On Mon, 24 Jan 2022 21:11:12 +0800
> Yicong Yang <yangyicong@xxxxxxxxxxxxx> wrote:
>
>> Register PMU device of PTT trace, then users can use
>> trace through perf command. The driver makes use of perf
>> AUX trace and support following events to configure the
>> trace:
>>
>> - filter: select Root port or Endpoint to trace
>> - type: select the type of traced TLP headers
>> - direction: select the direction of traced TLP headers
>> - format: select the data format of the traced TLP headers
>>
>> This patch adds the PMU driver part of PTT trace. The perf
>> command support of PTT trace is added in the following
>> patch.
>>
>> Signed-off-by: Yicong Yang <yangyicong@xxxxxxxxxxxxx>
>> ---
>
>
>> @@ -294,6 +346,405 @@ static void hisi_ptt_init_ctrls(struct hisi_ptt *hisi_ptt)
>> hisi_ptt->trace_ctrl.default_cpu = cpumask_first(cpumask_of_node(dev_to_node(&pdev->dev)));
>> }
>>
>> +#define HISI_PTT_PMU_FILTER_IS_PORT BIT(19)
>> +#define HISI_PTT_PMU_FILTER_VAL_MASK GENMASK(15, 0)
>> +#define HISI_PTT_PMU_DIRECTION_MASK GENMASK(23, 20)
>> +#define HISI_PTT_PMU_TYPE_MASK GENMASK(31, 24)
>> +#define HISI_PTT_PMU_FORMAT_MASK GENMASK(35, 32)
>> +
>> +static ssize_t available_filters_show(struct device *dev,
>> + struct device_attribute *attr,
>> + char *buf)
>> +{
>> + struct hisi_ptt *hisi_ptt = to_hisi_ptt(dev_get_drvdata(dev));
>> + struct hisi_ptt_filter_desc *filter;
>> + int pos = 0;
>> +
>> + if (list_empty(&hisi_ptt->port_filters))
>> + return sysfs_emit(buf, "#### No available filter ####\n");
>> +
>
> This is a very unusual sysfs attribute.
> They are supposed to be one "thing" per file, so I'd have expected this to
> be at least two files
>
> root_ports_available_filters
> request_available_filters
> and no available filter is indicated by these attribute returning an empty
> string.
>

Splitting it makes sense to me, as we do maintain two separate lists, one for root ports
and one for requesters. But perhaps the names below are better?

available_root_port_filters
available_requester_filters

And I feel we don't need the titles if we split it into two files, so it'll look like:
$ cat available_root_port_filters
0000:00:10.0 0x80001
0000:00:11.0 0x80004
$ cat available_requester_filters
0000:01:00.0 0x00100
0000:01:00.1 0x00101

It's also easier for scripts to parse, I think.
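
To illustrate the scripting point, here is a minimal userspace sketch of parsing one
line of the proposed format. It assumes each line is "<PCI address> 0x<hex value>" as
in the example output above; struct ptt_filter and the ptt_* helper names are
hypothetical and only for illustration. The bit positions mirror
HISI_PTT_PMU_FILTER_IS_PORT and HISI_PTT_PMU_FILTER_VAL_MASK from the patch.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Bit 19 marks a Root Port filter (HISI_PTT_PMU_FILTER_IS_PORT in the patch). */
#define FILTER_IS_PORT   (UINT32_C(1) << 19)
/* Bits 15:0 carry the filter value (HISI_PTT_PMU_FILTER_VAL_MASK in the patch). */
#define FILTER_VAL_MASK  UINT32_C(0xffff)

/* Hypothetical container for one parsed line. */
struct ptt_filter {
	char bdf[16];     /* PCI address, e.g. "0000:00:10.0" */
	uint32_t config;  /* value printed in the second column */
};

/* Parse one "<BDF> 0x<hex>" line. Returns 0 on success, -1 on malformed input. */
static int ptt_parse_filter_line(const char *line, struct ptt_filter *out)
{
	unsigned int config;

	/* %x accepts an optional "0x" prefix, so "0x80001" parses as 0x80001. */
	if (sscanf(line, "%15s %x", out->bdf, &config) != 2)
		return -1;

	out->config = config;
	return 0;
}

/* True if the parsed value selects a Root Port rather than a requester. */
static bool ptt_filter_is_port(const struct ptt_filter *f)
{
	return (f->config & FILTER_IS_PORT) != 0;
}
```

With no title lines in the files, a tool can feed every line straight into such a
parser without special cases.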

> However you need to match convention for hwtracing drivers so if
> this is common approach perhaps you could point me to a similar
> example? My grep skills didn't find me one.
>
>> + mutex_lock(&hisi_ptt->mutex);
>> + pos += sysfs_emit_at(buf, pos, "#### Root Ports ####\n");
>> + list_for_each_entry(filter, &hisi_ptt->port_filters, list)
>> + pos += sysfs_emit_at(buf, pos, "%s 0x%05lx\n",
>> + pci_name(filter->pdev),
>> + hisi_ptt_get_filter_val(filter->pdev) |
>> + HISI_PTT_PMU_FILTER_IS_PORT);
>> +
>> + pos += sysfs_emit_at(buf, pos, "#### Requesters ####\n");
>> + list_for_each_entry(filter, &hisi_ptt->req_filters, list)
>> + pos += sysfs_emit_at(buf, pos, "%s 0x%05x\n",
>> + pci_name(filter->pdev),
>> + hisi_ptt_get_filter_val(filter->pdev));
>> +
>> + mutex_unlock(&hisi_ptt->mutex);
>> + return pos;
>> +}
>> +static DEVICE_ATTR_ADMIN_RO(available_filters);
>> +
>
> ...
>
>
>> +static int hisi_ptt_trace_valid_config_onehot(u32 val, u32 *available_list, u32 list_size)
>> +{
>> + int i, ret = -EINVAL;
>> +
>> + for (i = 0; i < list_size; i++)
>> + if (val == available_list[i]) {
>> + ret = 0;
>
> return 0;
>

ok.

>> + break;
>> + }
>> +
>> + return ret;
>
> return -EINVAL;

ok.
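
With both suggestions applied, the helper would reduce to roughly the following. This
is a userspace sketch so it can be compiled standalone: it uses <stdint.h> types in
place of the kernel's u32, and the shortened function name is only for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return 0 if val matches one entry of available_list, -EINVAL otherwise.
 * Returning directly from the loop removes the ret variable and the break.
 */
static int ptt_valid_config_onehot(uint32_t val, const uint32_t *available_list,
				   size_t list_size)
{
	size_t i;

	for (i = 0; i < list_size; i++)
		if (val == available_list[i])
			return 0;

	return -EINVAL;
}
```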

>
>> +}
>> +
>
>> +
>> +static void hisi_ptt_pmu_free_aux(void *aux)
>> +{
>> + struct hisi_ptt_pmu_buf *buf = aux;
>> +
>> + vunmap(buf->base);
>> + kfree(buf);
>> +}
>> +
>
>
> ...
>
>> +static int hisi_ptt_pmu_add(struct perf_event *event, int flags)
>> +{
>> + struct hisi_ptt *hisi_ptt = to_hisi_ptt(event->pmu);
>> + struct hw_perf_event *hwc = &event->hw;
>> + int cpu = event->cpu;
>> +
>> + if (cpu == -1 && smp_processor_id() != hisi_ptt->trace_ctrl.default_cpu)
>
> This check is not entirely obvious to me. Perhaps a comment would help
> readers understand why this condition is successful, but doesn't involve
> actually starting the pmu?
>

I'm not sure my description is entirely accurate. A perf session adds and starts the event
on each cpu, or only on a range of cpus if the user specified one with perf's -C parameter.
This information is passed to the PMU driver through event->cpu, where -1 indicates that the
user didn't specify a cpu. This function is called on every cpu, or on the cpus specified by
the user.

Since we're not tracing CPUs, we don't need every cpu to start the trace, so we add the
check here to allow only the event on the default cpu to start the trace. Other cpus
just return. The default cpu is the first cpu of the NUMA node where the PTT device is located.
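
As a rough illustration of the check, here is a userspace model of the decision (not
driver code). The function name and parameters are hypothetical; in the driver the
values come from event->cpu, smp_processor_id() and trace_ctrl.default_cpu.

```c
#include <stdbool.h>

/*
 * Userspace model of the early return at the top of hisi_ptt_pmu_add().
 *
 * event_cpu:   event->cpu, -1 when the user gave no -C option
 * current_cpu: the cpu this add() call runs on
 * default_cpu: first cpu of the NUMA node of the PTT device
 *
 * Returns false when add() should return 0 without starting the trace.
 */
static bool ptt_add_should_start(int event_cpu, int current_cpu, int default_cpu)
{
	/* No cpu requested: only the default cpu goes on to start the trace. */
	if (event_cpu == -1 && current_cpu != default_cpu)
		return false;

	/* As in the quoted code, an explicitly requested cpu is not filtered here. */
	return true;
}
```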

>> + return 0;
>> +
>> + hwc->state = PERF_HES_STOPPED | PERF_HES_UPTODATE;
>> +
>> + if (flags & PERF_EF_START) {
>> + hisi_ptt_pmu_start(event, PERF_EF_RELOAD);
>> + if (hwc->state & PERF_HES_STOPPED)
>> + return -EINVAL;
>> + }
>> +
>> + return 0;
>> +}
>
> ...
>
>> /*
>> * The DMA of PTT trace can only use direct mapping, due to some
>> * hardware restriction. Check whether there is an iommu or the
>> @@ -359,6 +810,12 @@ static int hisi_ptt_probe(struct pci_dev *pdev,
>>
>> hisi_ptt_init_ctrls(hisi_ptt);
>>
>> + ret = hisi_ptt_register_pmu(hisi_ptt);
>> + if (ret) {
>> + pci_err(pdev, "failed to register pmu device, ret = %d", ret);
>
> Given I think this exposes userspace interfaces, it should be the very
> last thing done in probe(). Otherwise we have a race condition (at least in
> theory) where someone starts using it before we then fail the iommu mapping check.
>

Thanks for catching this. I think it would be a problem, so I'll move the iommu mapping
check ahead of the PMU registration.
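
The intended ordering can be sketched with a small userspace model. Everything here
(struct ptt_model and the ptt_model_* functions) is a hypothetical stand-in for the
real probe path, only meant to show that nothing becomes visible to userspace when
the mapping check fails.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the driver state; not the real struct hisi_ptt. */
struct ptt_model {
	bool direct_mapping;  /* outcome of the iommu mapping check */
	bool pmu_registered;  /* true once userspace could reach the PMU */
};

/* Models hisi_ptt_check_iommu_mapping(): nonzero means the device cannot work. */
static int ptt_model_check_iommu_mapping(const struct ptt_model *ptt)
{
	return ptt->direct_mapping ? 0 : -1;
}

/* Models hisi_ptt_register_pmu(): from here on userspace interfaces exist. */
static int ptt_model_register_pmu(struct ptt_model *ptt)
{
	ptt->pmu_registered = true;
	return 0;
}

/* Probe order after the change: validate first, expose to userspace last. */
static int ptt_model_probe(struct ptt_model *ptt)
{
	int ret;

	ret = ptt_model_check_iommu_mapping(ptt);
	if (ret)
		return ret;

	return ptt_model_register_pmu(ptt);
}
```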

Thanks.

>
>> + return ret;
>> + }
>> +
>> ret = hisi_ptt_check_iommu_mapping(hisi_ptt);
>> if (ret) {
>> pci_err(pdev, "cannot work with non-direct DMA mapping.\n");
>
> Thanks,
>
> Jonathan
>