Re: [PATCH v4 04/19] selftests/resctrl: Close perf value read fd on errors

From: Ilpo Järvinen
Date: Fri Jul 14 2023 - 06:35:32 EST


On Thu, 13 Jul 2023, Reinette Chatre wrote:

> Hi Ilpo,
>
> On 7/13/2023 6:19 AM, Ilpo Järvinen wrote:
> > Perf event fd (fd_lm) is not closed on some error paths.
> >
> > Always close fd_lm in get_llc_perf() and add close into an error
> > handling block in cat_val().
> >
> > Fixes: 790bf585b0ee ("selftests/resctrl: Add Cache Allocation Technology (CAT) selftest")
> > Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@xxxxxxxxxxxxxxx>
> > ---
> > tools/testing/selftests/resctrl/cache.c | 10 +++++-----
> > 1 file changed, 5 insertions(+), 5 deletions(-)
> >
> > diff --git a/tools/testing/selftests/resctrl/cache.c b/tools/testing/selftests/resctrl/cache.c
> > index 8a4fe8693be6..ced47b445d1e 100644
> > --- a/tools/testing/selftests/resctrl/cache.c
> > +++ b/tools/testing/selftests/resctrl/cache.c
> > @@ -87,21 +87,20 @@ static int reset_enable_llc_perf(pid_t pid, int cpu_no)
> > static int get_llc_perf(unsigned long *llc_perf_miss)
> > {
> > __u64 total_misses;
> > + int ret;
> >
> > /* Stop counters after one span to get miss rate */
> >
> > ioctl(fd_lm, PERF_EVENT_IOC_DISABLE, 0);
> >
> > - if (read(fd_lm, &rf_cqm, sizeof(struct read_format)) == -1) {
> > + ret = read(fd_lm, &rf_cqm, sizeof(struct read_format));
> > + close(fd_lm);
> > + if (ret == -1) {
> > perror("Could not get llc misses through perf");
> > -
> > return -1;
> > }
> >
> > total_misses = rf_cqm.values[0].value;
> > -
> > - close(fd_lm);
> > -
> > *llc_perf_miss = total_misses;
> >
> > return 0;
> > @@ -253,6 +252,7 @@ int cat_val(struct resctrl_val_param *param)
> > memflush, operation, resctrl_val)) {
> > fprintf(stderr, "Error-running fill buffer\n");
> > ret = -1;
> > + close(fd_lm);
> > break;
> > }
> >
>
> Instead of fixing these existing patterns I think it would make the code
> easier to understand and maintain if it is made symmetrical.
> Having the perf event fd opened in one place but its close()
> scattered elsewhere has the potential for confusion and making later
> mistakes easy to miss.
>
> What if perf event fd is closed in a new "disable_llc_perf()" that
> is matched with "reset_enable_llc_perf()" and called
> from cat_val()?
>
> I think this raises another issue with the test trickery where
> measure_cache_vals() has some assumptions about state based on the
> test name.

I very much agree with the principle here, and I have already created
patches that do a major cleanup in this area. The cleaned-up code makes
pe_fd a local variable of cat_val() and closes it in cat_val() with the
usual patterns.
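
As a rough illustration of that kind of pattern (this is not the pending
patch; open_counter_fd(), run_step() and measure_once() are made-up
stand-ins, and /dev/zero replaces the perf event so the sketch runs
without perf privileges), the fd lives in one function and every path
after a successful open leaves through the same close():

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical stand-in for the perf event setup (assumption, not resctrl code). */
static int open_counter_fd(void)
{
	return open("/dev/zero", O_RDONLY);
}

/* Hypothetical stand-in for one measurement step; fails on demand. */
static int run_step(int fd, int should_fail, unsigned long long *value)
{
	if (should_fail)
		return -1;
	if (read(fd, value, sizeof(*value)) != (ssize_t)sizeof(*value))
		return -1;
	return 0;
}

/* The fd is opened and closed in the same function, on success and on error. */
int measure_once(int should_fail, unsigned long long *value_out)
{
	unsigned long long value;
	int pe_fd, ret;

	assert(value_out != NULL);

	pe_fd = open_counter_fd();
	if (pe_fd < 0) {
		perror("open");
		return -1;
	}

	ret = run_step(pe_fd, should_fail, &value);
	if (ret) {
		fprintf(stderr, "measurement step failed\n");
		goto out_close;
	}

	*value_out = value;

out_close:
	/* Single close() shared by the success and error paths. */
	close(pe_fd);
	return ret;
}
```

With this shape, adding another failure point later only needs a
"goto out_close", so a new error path cannot skip the close().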

However, that patch currently sits after the L3 CAT test rewrite.
Backporting the cleanups/refactors into this series would require
considerable effort because the n-step cleanup patches and the L3 CAT
test rewrite are so intertwined in this area. There is a lot to clean up
here, and the L3 rewrite touches the same areas, so it's a net full of
conflicts.

Do you want me to spend the effort to backport them into this series
(I expect it will take some time)?

I currently have these items pending besides this series (in order):
- L3 CAT test rewrite and its preparatory patches
- More cleanups (including the pe_fd cleanup)
- New generalized test framework
- L2 CAT test

--
i.