Re: [PATCH v2 1/5] selftests/sgx: Retry the ioctl()'s returned with EAGAIN
From: Jarkko Sakkinen
Date: Mon Sep 12 2022 - 06:40:52 EST
On Fri, Sep 09, 2022 at 07:01:36AM +0300, Jarkko Sakkinen wrote:
> On Thu, Sep 08, 2022 at 05:06:58PM -0700, Reinette Chatre wrote:
> > Hi Jarkko,
> >
> > On 9/8/2022 4:19 PM, Jarkko Sakkinen wrote:
> > > On Thu, Sep 08, 2022 at 03:43:06PM -0700, Reinette Chatre wrote:
> > >> Hi Jarkko and Haitao,
> > >>
> > >> On 9/4/2022 7:04 PM, Jarkko Sakkinen wrote:
> > >>> From: Haitao Huang <haitao.huang@xxxxxxxxxxxxxxx>
> > >>>
> > > >>> For EMODT and EREMOVE ioctl()'s with a large range, the kernel
> > > >>> may not finish in one shot and may instead return the EAGAIN
> > > >>> error code together with the count of bytes of EPC pages on
> > > >>> which the operation finished successfully.
> > >>>
> > >>> Change the unclobbered_vdso_oversubscribed_remove test
> > >>> to rerun the ioctl()'s in a loop, updating offset and length
> > >>> using the byte count returned in each iteration.
> > >>>
> > >>> Fixes: 6507cce561b4 ("selftests/sgx: Page removal stress test")
> > >>
> > >> Should this patch be moved to the "critical fixes for v6.0" series?
> > >
> > > I think not because it does not risk stability of the
> > > kernel itself. It's "nice to have" but not mandatory.
> >
> > ok, thank you for considering it.
> >
> > ...
> >
> > >>> @@ -453,16 +454,30 @@ TEST_F_TIMEOUT(enclave, unclobbered_vdso_oversubscribed_remove, 900)
> > >>> modt_ioc.offset = heap->offset;
> > >>> modt_ioc.length = heap->size;
> > >>> modt_ioc.page_type = SGX_PAGE_TYPE_TRIM;
> > >>> -
> > >>> + count = 0;
> > >>> TH_LOG("Changing type of %zd bytes to trimmed may take a while ...",
> > >>> heap->size);
> > >>> - ret = ioctl(self->encl.fd, SGX_IOC_ENCLAVE_MODIFY_TYPES, &modt_ioc);
> > >>> - errno_save = ret == -1 ? errno : 0;
> > >>> + do {
> > >>> + ret = ioctl(self->encl.fd, SGX_IOC_ENCLAVE_MODIFY_TYPES, &modt_ioc);
> > >>> +
> > >>> + errno_save = ret == -1 ? errno : 0;
> > >>> + if (errno_save != EAGAIN)
> > >>> + break;
> > >>> +
> > >>> + EXPECT_EQ(modt_ioc.result, 0);
> > >>
> > > >> If this check triggers then something is seriously wrong, and in that case
> > > >> the loop may be unable to terminate, or the error condition may keep
> > > >> appearing until the loop terminates (which may take many iterations). Considering
> > > >> the severity and risk I do think that ASSERT_EQ() would be more appropriate,
> > > >> similar to how ASSERT_EQ() is used in patch 5/5.
> > >>
> > >> Apart from that I think that this looks good.
> > >>
> > >> Thank you very much for adding this.
> > >>
> > >> Reinette
> > >
> > > > Hmm... I could do something along the lines of:
> > >
> > > /*
> > > > * Get time since Epoch in milliseconds.
> > > */
> > > unsigned long get_time(void)
> > > {
> > > struct timeval start;
> > >
> > > gettimeofday(&start, NULL);
> > >
> > > > return (unsigned long)start.tv_sec * 1000L + (unsigned long)start.tv_usec / 1000L;
> > > }
> > >
> > > and
> > >
> > > #define IOCTL_RETRY_TIMEOUT 100
> > >
> > > In the test function:
> > >
> > > unsigned long start_time;
> > >
> > > /* ... */
> > >
> > > start_time = get_time();
> > > do {
> > > > EXPECT_LT(get_time() - start_time, IOCTL_RETRY_TIMEOUT);
> > >
> > > /* ... */
> > > }
> > >
> > > /* ... */
> > >
> > > What do you think?
> >
> > I do think that your proposal can be considered as an additional check in this
> > test but, as I understand it, it does not address my feedback.
> >
> > In this patch the flow is:
> >
> > do {
> > ret = ioctl(self->encl.fd, SGX_IOC_ENCLAVE_MODIFY_TYPES, &modt_ioc);
> >
> > errno_save = ret == -1 ? errno : 0;
> > if (errno_save != EAGAIN)
> > break;
> >
> > EXPECT_EQ(modt_ioc.result, 0);
> > ...
> > } while ...
> >
> >
> > If this EXPECT_EQ() check fails then it means that errno_save is EAGAIN
> > and modt_ioc.result != 0. This should never happen because in the kernel
> > (sgx_enclave_modify_types()) the only time modt_ioc.result can be set is
> > when the ioctl() returns EFAULT.
> >
> > In my opinion this check should be changed to:
> > ASSERT_EQ(modt_ioc.result, 0);
>
> Right, I missed this. It should definitely be ASSERT_EQ().
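
For reference, the retry flow discussed above can be sketched stand-alone
roughly as follows. This is only an illustration under assumptions, not the
patch itself: struct fake_modt, FAKE_CHUNK, fake_modify_types() and
modify_types_retry() are hypothetical stand-ins so that the loop can be
compiled without SGX hardware, and plain assert() takes the place of
ASSERT_EQ().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the ioctl() argument; not the real uapi layout. */
struct fake_modt {
	unsigned long offset;
	unsigned long length;
	unsigned long result;
	unsigned long count;
};

/* Assumed limit on how many bytes the fake "kernel" handles per call. */
#define FAKE_CHUNK (4UL * 4096UL)

/*
 * Hypothetical stand-in for ioctl(SGX_IOC_ENCLAVE_MODIFY_TYPES): handles at
 * most FAKE_CHUNK bytes per call and fails with EAGAIN when work remains.
 */
static int fake_modify_types(struct fake_modt *req)
{
	req->result = 0;

	if (req->length > FAKE_CHUNK) {
		req->count = FAKE_CHUNK;
		errno = EAGAIN;
		return -1;
	}

	req->count = req->length;
	return 0;
}

/*
 * Re-issue the request while it fails with EAGAIN, advancing offset and
 * length by the byte count reported for each partial run. Returns the
 * final errno value (0 on success) and stores the byte total in *total.
 */
static int modify_types_retry(struct fake_modt *req, unsigned long *total)
{
	int ret, errno_save;

	*total = 0;

	do {
		ret = fake_modify_types(req);

		errno_save = ret == -1 ? errno : 0;
		if (errno_save != EAGAIN)
			break;

		/* Fatal check: result must stay 0 while the call says EAGAIN. */
		assert(req->result == 0);

		*total += req->count;
		req->offset += req->count;
		req->length -= req->count;
		req->count = 0;
	} while (req->length != 0);

	/* The final, successful call reports the remaining bytes. */
	if (errno_save == 0)
		*total += req->count;

	return errno_save;
}
```

The fatal assert() means a non-zero result under EAGAIN stops the run at
once instead of being reported again on every later iteration.
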
I was thinking of adding a patch that adds a helper to calculate the static
content length from the last item of the segment table (offset + size)
and replaces the total_length calculations in various tests.
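
A minimal sketch of what such a helper could look like, assuming the segment
table is sorted by offset so that the last entry ends the static content.
struct seg and static_content_length() are hypothetical names for
illustration; the selftest's own segment struct has more fields.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified segment descriptor (illustration only). */
struct seg {
	unsigned long offset;
	unsigned long size;
};

/*
 * Static content length, taken as the end of the last segment
 * (offset + size). Assumes the table is non-empty and sorted by offset.
 */
static unsigned long static_content_length(const struct seg *segs,
					   size_t nr_segs)
{
	const struct seg *last;

	assert(segs != NULL && nr_segs > 0);

	last = &segs[nr_segs - 1];
	return last->offset + last->size;
}
```

Tests that currently sum the segment sizes by hand could call one helper
like this instead, provided the sorted-table assumption holds.
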
I won't send a new version this week because I'm at Open Source Summit
EU and Linux Security Summit EU.
BR, Jarkko