Re: [PATCH v8 01/13] tools/libperf: avoid moving of fds at fdarray__filter() call

From: Alexey Budankov
Date: Thu Jun 25 2020 - 15:32:37 EST



On 25.06.2020 20:14, Jiri Olsa wrote:
> On Wed, Jun 24, 2020 at 08:19:32PM +0300, Alexey Budankov wrote:
>>
>> On 17.06.2020 11:35, Alexey Budankov wrote:
>>>
>>> Skip fds with zeroed revents field from count and avoid fds moving
>>> at fdarray__filter() call so fds indices returned by fdarray__add()
>>> call stay the same and can be used for direct access and processing
>>> of fd revents status field at entries array of struct fdarray object.
>>>
>>> Signed-off-by: Alexey Budankov <alexey.budankov@xxxxxxxxxxxxxxx>
>>> ---
>>> tools/lib/api/fd/array.c | 11 +++++------
>>> tools/perf/tests/fdarray.c | 20 ++------------------
>>> 2 files changed, 7 insertions(+), 24 deletions(-)
>>>
>>> diff --git a/tools/lib/api/fd/array.c b/tools/lib/api/fd/array.c
>>> index 58d44d5eee31..97843a837370 100644
>>> --- a/tools/lib/api/fd/array.c
>>> +++ b/tools/lib/api/fd/array.c
>>> @@ -93,22 +93,21 @@ int fdarray__filter(struct fdarray *fda, short revents,
>>> return 0;
>>>
>>> for (fd = 0; fd < fda->nr; ++fd) {
>>> + if (!fda->entries[fd].revents)
>>> + continue;
>>> +
>>
>> So it looks like this condition also filters out event fds that are not signaling, not only
>> control and other fds, and this should be avoided somehow so that such event-related fds
>> are still counted. Several options have been proposed so far:
>>
>> 1) Explicit typing of fds via API extension and filtering based on the types:
>> a) with separate fdarray__add_stat() call
>> b) with type arg of existing fdarray__add() call
>> c) various memory management designs are possible
>>
>> 2) Playing tricks with fd positions inside entries and assumptions on fdarray API calls ordering
>> - looks more like a hack than a designed solution
>>
>> 3) Rewrite of fdarray class to allocate a separate object for every added fd
>> - can be replaced with nonscrewing of fds by __filter()
>>
>> 4) Distinguish between fd types at fdarray__filter() using .revents == 0 condition
>> - seems to have corner cases and thus not applicable
>>
>> 5) Extension of fdarray__poll(, *arg_ptr, arg_size) with an arg holding an array of fds, to atomically poll
>> on fdarray__add()-ed fds and external arg fds, followed by processing of the external arg fds
>>
>> 6) Rewrite of fdarray class on epoll() call basis
>> - introduces new scalability restrictions for Perf tool
>
> hum, how many fds for polling do you expect in your workloads?

Currently it is several hundred, so the default limit of 1K open files is easily
hit, and the "Profile a Large Number of PMU Events on Multi-Core Systems"
section of [1] recommends:

soft nofile 65535
hard nofile 65535

for the /etc/security/limits.conf settings.

~Alexey

[1] https://software.intel.com/content/www/us/en/develop/documentation/vtune-cookbook/top/configuration-recipes/profiling-hardware-without-sampling-drivers.html

>
> jirka
>