Re: [PATCH v3 3/4] perf record: apply affinity masks when reading mmap buffers

From: Alexey Budankov
Date: Thu Jan 10 2019 - 05:20:09 EST


On 10.01.2019 12:54, Jiri Olsa wrote:
> On Thu, Jan 10, 2019 at 12:41:55PM +0300, Alexey Budankov wrote:
>> On 09.01.2019 19:53, Jiri Olsa wrote:
>>> On Wed, Jan 09, 2019 at 12:38:23PM +0300, Alexey Budankov wrote:
>>>
>>> SNIP
>>>
>>>> diff --git a/tools/perf/util/mmap.c b/tools/perf/util/mmap.c
>>>> index e5220790f1fb..ee0230eed635 100644
>>>> --- a/tools/perf/util/mmap.c
>>>> +++ b/tools/perf/util/mmap.c
>>>> @@ -377,6 +377,24 @@ void perf_mmap__munmap(struct perf_mmap *map)
>>>> auxtrace_mmap__munmap(&map->auxtrace_mmap);
>>>> }
>>>>
>>>> +static void perf_mmap__setup_affinity_mask(struct perf_mmap *map, struct mmap_params *mp)
>>>> +{
>>>> + int c, cpu, nr_cpus, node;
>>>> +
>>>> + CPU_ZERO(&map->affinity_mask);
>>>> + if (mp->affinity == PERF_AFFINITY_NODE && cpu__max_node() > 1) {
>>>> + nr_cpus = cpu_map__nr(mp->cpu_map);
>>>> + node = cpu__get_node(map->cpu);
>>>> + for (c = 0; c < nr_cpus; c++) {
>>>> + cpu = mp->cpu_map->map[c]; /* map c index to online cpu index */
>>>> + if (cpu__get_node(cpu) == node)
>>>> + CPU_SET(cpu, &map->affinity_mask);
>>>
>>> should we do that from from all possible cpus task (perf record)
>>> can run on, instead of mp->cpu_map, which might be only subset
>>> (-C ... option)
>>
>> That is how it should be and because mp->cpu_map depends on -C option value
>> in this patch set version it requires to be corrected, possibly like this:
>>
>> struct mmap_params mp = {
>> .nr_cblocks = nr_cblocks,
>> .affinity = affinity,
>> .cpu_map = cpu_map__new(NULL) /* builds struct cpu_map from /sys/devices/system/cpu/online */
>> };
>> and
>> if (mp->affinity == PERF_AFFINITY_NODE && cpu__max_node() > 1 && mp->cpu_map)
>>
>> Thanks!
>>
>>>
>>> also node -> cpu_map is static configuration, we could prepare
>>> this map ahead (like cpunode_map) and just assign it in here
>>> based on node index
>>
>> It makes sense and either way is possible. However the static configuration
>> looks a bit trickier because it incurs additional mask objects duplication
>> and conversion from struct cpu_map to cpu_set_t still remains the same.
>
> ok, please at least put that node mask creation into separate function

Will do like this:

static void build_node_mask(const struct cpu_map *cpumap, int node, cpu_set_t *mask)

Thanks,
Alexey

>
> thanks,
> jirka
>