Re: [PATCH 1/4] hwtracing: hisi_ptt: Make cpumask only present online CPUs

From: Yicong Yang
Date: Wed Mar 29 2023 - 23:53:22 EST


On 2023/3/29 0:24, Jonathan Cameron wrote:
> On Wed, 15 Mar 2023 17:43:13 +0800
> Yicong Yang <yangyicong@xxxxxxxxxx> wrote:
>
>> From: Yicong Yang <yangyicong@xxxxxxxxxxxxx>
>>
>> perf tries to start a PTT trace on every CPU present in the cpumask sysfs
>> attribute, and it fails to start on offline CPUs (see the comments in
>> perf_event_open()). But the driver uses cpumask_of_node() to export
>> the available cpumask, which may include offline CPUs and may cause
>> perf to fail unintentionally. Fix this by exporting only the online CPUs of the node.
>
> I can't find clear documentation for cpumask_of_node(), but
> chasing through on arm64 (which is what we care about for this driver),
> it's maintained via numa_add_cpu() and numa_remove_cpu().
> Those are called in arch/arm64/kernel/smp.c in locations that are closely coupled
> with set_cpu_online(cpu, XXX):
> https://elixir.bootlin.com/linux/v6.3-rc4/source/arch/arm64/kernel/smp.c#L246
> https://elixir.bootlin.com/linux/v6.3-rc4/source/arch/arm64/kernel/smp.c#L303
>
> Now there are races where the two might not be in sync, but in this case
> we are just exposing the result to userspace, so the chance of a race
> after this sysfs attribute has been read seems much higher to me, and
> I don't think we can do anything about that.
>
> Is there another path that I'm missing where online and node masks are out
> of sync?
>

Maybe not. This patch may be incorrect and I need to investigate more, so let me
drop it from the series. Tested, and everything seems fine now.

I found this problem and referred to commit 064f0e9302af ("mm: only display online cpus of the numa node"),
which might address the same problem. But the change seems unnecessary here, since
cpumask_of_node() already includes only online CPUs.
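
To illustrate why the extra cpumask_and() looks redundant, here is a minimal
userspace sketch (not kernel code). It assumes, as described above for arm64,
that the node mask is updated together with the online mask on hotplug.
struct cpu_state and the model_* helpers are hypothetical names made up for
this illustration, and the model ignores the short windows where the two
masks can differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: one bit per CPU, a single NUMA node, at most 64 CPUs. */
struct cpu_state {
	uint64_t online;	/* stands in for cpu_online_mask */
	uint64_t node_mask;	/* stands in for cpumask_of_node(node) */
};

/* Models bring-up: numa_add_cpu() paired with set_cpu_online(cpu, true). */
static void model_cpu_up(struct cpu_state *s, unsigned int cpu)
{
	assert(cpu < 64);
	s->node_mask |= UINT64_C(1) << cpu;
	s->online |= UINT64_C(1) << cpu;
}

/* Models teardown: numa_remove_cpu() paired with set_cpu_online(cpu, false). */
static void model_cpu_down(struct cpu_state *s, unsigned int cpu)
{
	assert(cpu < 64);
	s->node_mask &= ~(UINT64_C(1) << cpu);
	s->online &= ~(UINT64_C(1) << cpu);
}

/* What the dropped patch would export: node mask ANDed with online mask. */
static uint64_t model_patched_mask(const struct cpu_state *s)
{
	return s->node_mask & s->online;
}
```

Under that assumption, model_patched_mask() always returns the same value as
node_mask, which matches what I see in testing.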

Thanks.

> Jonathan
>
>
>>
>> Fixes: ff0de066b463 ("hwtracing: hisi_ptt: Add trace function support for HiSilicon PCIe Tune and Trace device")
>> Signed-off-by: Yicong Yang <yangyicong@xxxxxxxxxxxxx>
>
>> ---
>> drivers/hwtracing/ptt/hisi_ptt.c | 13 +++++++++++--
>> 1 file changed, 11 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/hwtracing/ptt/hisi_ptt.c b/drivers/hwtracing/ptt/hisi_ptt.c
>> index 30f1525639b5..0a10c7ec46ad 100644
>> --- a/drivers/hwtracing/ptt/hisi_ptt.c
>> +++ b/drivers/hwtracing/ptt/hisi_ptt.c
>> @@ -487,9 +487,18 @@ static ssize_t cpumask_show(struct device *dev, struct device_attribute *attr,
>>  			    char *buf)
>>  {
>>  	struct hisi_ptt *hisi_ptt = to_hisi_ptt(dev_get_drvdata(dev));
>> -	const cpumask_t *cpumask = cpumask_of_node(dev_to_node(&hisi_ptt->pdev->dev));
>> +	cpumask_var_t mask;
>> +	ssize_t n;
>>
>> -	return cpumap_print_to_pagebuf(true, buf, cpumask);
>> +	if (!alloc_cpumask_var(&mask, GFP_KERNEL))
>> +		return 0;
>> +
>> +	cpumask_and(mask, cpumask_of_node(dev_to_node(&hisi_ptt->pdev->dev)),
>> +		    cpu_online_mask);
>> +	n = cpumap_print_to_pagebuf(true, buf, mask);
>> +	free_cpumask_var(mask);
>> +
>> +	return n;
>>  }
>> static DEVICE_ATTR_RO(cpumask);
>>
>