Re: [PATCH v2 1/5] selftests/sgx: Retry the ioctl()'s returned with EAGAIN

From: Reinette Chatre
Date: Thu Sep 08 2022 - 20:07:11 EST


Hi Jarkko,

On 9/8/2022 4:19 PM, Jarkko Sakkinen wrote:
> On Thu, Sep 08, 2022 at 03:43:06PM -0700, Reinette Chatre wrote:
>> Hi Jarkko and Haitao,
>>
>> On 9/4/2022 7:04 PM, Jarkko Sakkinen wrote:
>>> From: Haitao Huang <haitao.huang@xxxxxxxxxxxxxxx>
>>>
>>> For EMODT and EREMOVE ioctl()'s with a large range, kernel
>>> may not finish in one shot and return EAGAIN error code
>>> and count of bytes of EPC pages on that operations are
>>> finished successfully.
>>>
>>> Change the unclobbered_vdso_oversubscribed_remove test
>>> to rerun the ioctl()'s in a loop, updating offset and length
>>> using the byte count returned in each iteration.
>>>
>>> Fixes: 6507cce561b4 ("selftests/sgx: Page removal stress test")
>>
>> Should this patch be moved to the "critical fixes for v6.0" series?
>
> I think not because it does not risk stability of the
> kernel itself. It's "nice to have" but not mandatory.

ok, thank you for considering it.

...

>>> @@ -453,16 +454,30 @@ TEST_F_TIMEOUT(enclave, unclobbered_vdso_oversubscribed_remove, 900)
>>> modt_ioc.offset = heap->offset;
>>> modt_ioc.length = heap->size;
>>> modt_ioc.page_type = SGX_PAGE_TYPE_TRIM;
>>> -
>>> + count = 0;
>>> TH_LOG("Changing type of %zd bytes to trimmed may take a while ...",
>>> heap->size);
>>> - ret = ioctl(self->encl.fd, SGX_IOC_ENCLAVE_MODIFY_TYPES, &modt_ioc);
>>> - errno_save = ret == -1 ? errno : 0;
>>> + do {
>>> + ret = ioctl(self->encl.fd, SGX_IOC_ENCLAVE_MODIFY_TYPES, &modt_ioc);
>>> +
>>> + errno_save = ret == -1 ? errno : 0;
>>> + if (errno_save != EAGAIN)
>>> + break;
>>> +
>>> + EXPECT_EQ(modt_ioc.result, 0);
>>
>> If this check triggers then there is something seriously wrong and in that case
>> it may also be that this loop may be unable to terminate or the error condition would
>> keep appearing until the loop terminates (which may be many iterations). Considering
>> the severity and risk I do think that ASSERT_EQ() would be more appropriate,
>> similar to how ASSERT_EQ() is used in patch 5/5.
>>
>> Apart from that I think that this looks good.
>>
>> Thank you very much for adding this.
>>
>> Reinette
>
> Hmm... I could along the lines:
>
> /*
> * Get time since Epoch is milliseconds.
> */
> unsigned long get_time(void)
> {
> struct timeval start;
>
> gettimeofday(&start, NULL);
>
> return (unsigneg long)start.tv_sec * 1000L + (unsigned long)start.tv_usec / 1000L;
> }
>
> and
>
> #define IOCTL_RETRY_TIMEOUT 100
>
> In the test function:
>
> unsigned long start_time;
>
> /* ... */
>
> start_time = get_time();
> do {
> EXPECT_LT(get_time() - start_time(), IOCTL_RETRY_TIMEOUT);
>
> /* ... */
> }
>
> /* ... */
>
> What do you think?

I do think that your proposal can be considered for an additional check in this
test but the way I understand it it does not address my feedback.

In this patch the flow is:

do {
ret = ioctl(self->encl.fd, SGX_IOC_ENCLAVE_MODIFY_TYPES, &modt_ioc);

errno_save = ret == -1 ? errno : 0;
if (errno_save != EAGAIN)
break;

EXPECT_EQ(modt_ioc.result, 0);
...
} while ...


If this EXPECT_EQ() check fails then it means that errno_save is EAGAIN
and modt_ioc.result != 0. This should never happen because in the kernel
(sgx_enclave_modify_types()) the only time modt_ioc.result can be set is
when the ioctl() returns EFAULT.

In my opinion this check should be changed to:
ASSERT_EQ(modt_ioc.result, 0);

This is my opinion because this check indicates a kernel bug and I do
not see value in continuing the test after a kernel bug is encountered.
My expectation is that this test is of value to folks modifying
the kernel code.

As for the new check you are proposing - it seems to me that kselftest
with TIMEOUT_DEFAULT already covers this.

Reinette