Re: [PATCH V1 11/13] selftests/resctrl: Change Cache Quality Monitoring (CQM) test

From: Sai Praneeth Prakhya
Date: Wed Mar 11 2020 - 14:12:38 EST


Hi Reinette,

On Wed, 2020-03-11 at 11:03 -0700, Reinette Chatre wrote:
[SNIP]

> Hi Sai,
>
> On 3/11/2020 10:33 AM, Sai Praneeth Prakhya wrote:
> > On Wed, 2020-03-11 at 10:19 -0700, Reinette Chatre wrote:
> > > On 3/10/2020 7:46 PM, Sai Praneeth Prakhya wrote:
> > > > On Tue, 2020-03-10 at 15:18 -0700, Reinette Chatre wrote:
> > > > > On 3/6/2020 7:40 PM, Sai Praneeth Prakhya wrote:
> > > I missed that. Thank you.
> > >
> > > fyi ... when I tried these tests I encountered the following error
> > > related to unmounting:
> > >
> > > [SNIP]
> > > ok Write schema "L3:1=7fff" to resctrl FS
> > > ok Write schema "L3:1=ffff" to resctrl FS
> > > ok Write schema "L3:1=1ffff" to resctrl FS
> > > ok Write schema "L3:1=3ffff" to resctrl FS
> > > # Unable to umount resctrl: Device or resource busy
> > > # Results are displayed in (Bytes)
> > > ok CQM: diff within 5% for mask 1
> > > # alloc_llc_cache_size: 2883584
> > > # avg_llc_occu_resc: 2973696
> > > ok CQM: diff within 5% for mask 3
> > > [SNIP]
> > >
> > > This seems to originate from resctrl_val() that forces an unmount but if
> > > that fails the error is not propagated.
> >
> > Yes, that's right and it's a good test. I didn't encounter this issue
> > during my testing because I wasn't using resctrl FS from other
> > terminals (I think you were using resctrl FS from other terminal and
> > hence resctrl_test was unable to unmount it).
>
> I was not explicitly testing for this but this may have been the case.
>
> As a sidenote ... could remount_resctrlfs() be called consistently? It
> seems to switch between being called with true/false and 1/0. Since its
> parameter type is boolean using true/false seems most appropriate.

Agreed, that makes sense. I will fix this in a separate patch.
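
To illustrate the point about call sites, here is a minimal sketch. It is not
the selftest code: remount_requested_sketch() and call_sites_sketch() are
hypothetical names, and the helper only echoes its flag so the two calling
styles can be compared.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in, not the selftest helper: it only reports
 * whether a remount was requested so the call sites can be compared.
 */
static int remount_requested_sketch(bool remount)
{
	return remount ? 1 : 0;
}

static int call_sites_sketch(void)
{
	/* Compiles, but hides that the parameter is a flag. */
	int a = remount_requested_sketch(1);
	/* Same value, and the intent is explicit. */
	int b = remount_requested_sketch(true);

	return a == b;
}
```

Both calls behave the same, so switching to true/false everywhere should be a
readability-only change.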

> > I think the error should not be propagated because unmounting resctrl
> > FS shouldn't stop us from checking the results. If measuring values
> > reports an error then we shouldn't check for results.
>
> This sounds right. It is inconsistent though ... the CQM test unmounts
> resctrl after it is run but the CAT test does not. Looking closer the
> CAT test seems to leave its artifacts around in resctrl and this should
> be cleaned up.

Yes, makes sense. I will fix the CAT test to clean things up.
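
The ordering discussed above (measure, try to clean up, check results only if
the measurement succeeded) could be sketched roughly as below. This is an
assumption-laden illustration, not what resctrl_val() does: step_fn,
run_test_sketch() and the two stubs are hypothetical names.

```c
#include <stdio.h>

/* Hypothetical callback type; a step returns 0 on success. */
typedef int (*step_fn)(void);

/* Stubs used only to exercise the sketch. */
static int step_ok(void)   { return 0; }
static int step_fail(void) { return -1; }

/*
 * Run a measurement, attempt cleanup, then check the results.
 * A failed cleanup is only reported; a failed measurement skips the check.
 * Returns 0 on success, -1 if the measurement or the check failed.
 */
static int run_test_sketch(step_fn measure, step_fn cleanup, step_fn check)
{
	int ret = measure();

	if (cleanup() != 0)
		fprintf(stderr, "# cleanup failed, continuing\n");

	if (ret != 0)
		return -1;	/* no valid data, so do not check results */

	return check() == 0 ? 0 : -1;
}
```

With these stubs, run_test_sketch(step_ok, step_fail, step_ok) still reports
success, which matches the "Unable to umount" output followed by passing CQM
results in the log above.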

> I am not sure about the expectations here. Unmounting resctrl after a
> test is run is indeed the easiest to clean up and may be ok.

The main reason for unmounting is this: assuming the user hasn't mounted
resctrl FS before running the test, we want to get back to the same state as
before the test, and also clean up any changes made to resctrl FS during the
test.
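
One possible way to express "get back to the same state" is to record whether
resctrl was mounted before the test and unmount only if the test did the
mounting. The sketch below is an assumption on my side and not what the
selftests do today (they unmount unconditionally); all function names in it
are hypothetical. It relies only on the /proc/mounts field order (device,
mount point, filesystem type).

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* True if one /proc/mounts-style line has filesystem type "resctrl". */
static bool is_resctrl_mount_line(const char *line)
{
	char dev[256], dir[256], type[64];

	return sscanf(line, "%255s %255s %63s", dev, dir, type) == 3 &&
	       strcmp(type, "resctrl") == 0;
}

/* Scan /proc/mounts for a resctrl entry. */
static bool resctrl_is_mounted(void)
{
	char line[1024];
	bool found = false;
	FILE *fp = fopen("/proc/mounts", "r");

	if (!fp)
		return false;	/* treat an unreadable table as "not mounted" */
	while (!found && fgets(line, sizeof(line), fp))
		found = is_resctrl_mount_line(line);
	fclose(fp);
	return found;
}

/* Unmount at the end only if the test itself did the mounting. */
static bool should_unmount_after_test(bool mounted_before_test)
{
	return !mounted_before_test;
}
```

A test would call resctrl_is_mounted() once at startup and pass the result to
should_unmount_after_test() during cleanup. This would leave a user's existing
mount alone, although it does not by itself undo changes made inside it.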

> It may be a
> surprise to the user though. Perhaps there can be a snippet in the
> README that warns people about this?

Sure, makes sense. I will add it.

Regards,
Sai