Re: [PATCH V1 11/13] selftests/resctrl: Change Cache Quality Monitoring (CQM) test

From: Sai Praneeth Prakhya
Date: Wed Mar 11 2020 - 13:38:42 EST


Hi Reinette,

On Wed, 2020-03-11 at 10:19 -0700, Reinette Chatre wrote:
> Hi Sai,
>
> On 3/10/2020 7:46 PM, Sai Praneeth Prakhya wrote:
> > On Tue, 2020-03-10 at 15:18 -0700, Reinette Chatre wrote:
> > > On 3/6/2020 7:40 PM, Sai Praneeth Prakhya wrote:
> > > > .mum_resctrlfs = 0,
> > > > .filename = RESULT_FILE_NAME,
> > > > - .mask = ~(long_mask << n) & long_mask,
> > > > - .span = cache_size * n / count_of_bits,
> > > > .num_of_runs = 0,
> > > > - .setup = cqm_setup,
> > > > + .setup = cqm_setup
> > > > };
> > > > + int ret;
> > > > + char schemata[64];
> > > > + unsigned long long_mask;
> > > >
> > > > - if (strcmp(benchmark_cmd[0], "fill_buf") == 0)
> > > > - sprintf(benchmark_cmd[1], "%lu", param.span);
> > > > + ret = remount_resctrlfs(1);
> > > > + if (ret)
> > > > + return ret;
> > >
> > > Here resctrl is remounted and followed by some changes to the root
> > > group's schemata. That is followed by a call to resctrl_val that
> > > attempts to remount resctrl again that will undo all the configurations
> > > inbetween.
> >
> > No, it wouldn't because mum_resctrlfs is 0. When resctrl FS is already
> > mounted
> > and mum_resctrlfs is 0, then remount_resctrlfs() is a noop.
> >
>
> I missed that. Thank you.
>
> fyi ... when I tried these tests I encountered the following error
> related to unmounting:
>
> [SNIP]
> ok Write schema "L3:1=7fff" to resctrl FS
> ok Write schema "L3:1=ffff" to resctrl FS
> ok Write schema "L3:1=1ffff" to resctrl FS
> ok Write schema "L3:1=3ffff" to resctrl FS
> # Unable to umount resctrl: Device or resource busy
> # Results are displayed in (Bytes)
> ok CQM: diff within 5% for mask 1
> # alloc_llc_cache_size: 2883584
> # avg_llc_occu_resc: 2973696
> ok CQM: diff within 5% for mask 3
> [SNIP]
>
> This seems to originate from resctrl_val() that forces an unmount but if
> that fails the error is not propagated.

Yes, that's right and it's a good test. I didn't encounter this issue during
my testing because I wasn't using resctrl FS from other terminals (I think you
were using resctrl FS from other terminal and hence resctrl_test was unable to
unmount it).

I think the error should not be propagated because unmounting resctrl FS
shouldn't stop us from checking the results. If measuring values reports an
error then we shouldn't check for results.

> > > > diff --git a/tools/testing/selftests/resctrl/resctrl_val.c
> > > > b/tools/testing/selftests/resctrl/resctrl_val.c
> > > > index 271cb5c976f5..c59fad6cb9b0 100644
> > > > --- a/tools/testing/selftests/resctrl/resctrl_val.c
> > > > +++ b/tools/testing/selftests/resctrl/resctrl_val.c
> > > > @@ -705,29 +705,21 @@ int resctrl_val(char **benchmark_cmd, struct
> > > > resctrl_val_param *param)
> > > > goto out;
> > > > }
> > > >
> > > > - /* Give benchmark enough time to fully run */
> > > > - sleep(1);
> > > > -
> > > > /* Test runs until the callback setup() tells the test to
> > > > stop. */
> > > > while (1) {
> > > > + ret = param->setup(param);
> > > > + if (ret) {
> > > > + ret = 0;
> > > > + break;
> > > > + }
> > > > +
> > > > + /* Measure vals sleeps for a second */
> > > > if ((strcmp(resctrl_val, "mbm") == 0) ||
> > > > (strcmp(resctrl_val, "mba") == 0)) {
> > > > - ret = param->setup(param);
> > > > - if (ret) {
> > > > - ret = 0;
> > > > - break;
> > > > - }
> > > > -
>
> (I refer to the above snippet in my comment below)
>
> > > > ret = measure_vals(param, &bw_resc_start);
> > > > if (ret)
> > > > break;
> > > > } else if (strcmp(resctrl_val, "cqm") == 0) {
> > > > - ret = param->setup(param);
> > > > - if (ret) {
> > > > - ret = 0;
> > > > - break;
> > > > - }
> > > > - sleep(1);
> > > > ret = measure_cache_vals(param, bm_pid);
> > > > if (ret)
> > > > break;
> > >
> > > This change affects not just the cache monitoring test. Could this
> > > change be extracted into its own patch to be clear what is done here and
> > > how it impacts the other tests?
> >
> > This change shouldn't impact other tests (i.e. CAT) because CAT will not
> > call
> > resctrl_val().
>
> I was referring to the snippet above that seems to impact the "mbm" and
> "mba" tests by moving the call to "param->setup" for the them.

Ok.. makes sense. Sure! I will make it into separate patch.

Regards,
Sai