Re: [PATCH v12 14/14] selftests/sgx: Add scripts for EPC cgroup testing

From: Haitao Huang
Date: Tue Apr 16 2024 - 10:55:11 EST


On Tue, 16 Apr 2024 09:10:12 -0500, Jarkko Sakkinen <jarkko@xxxxxxxxxx> wrote:

On Tue Apr 16, 2024 at 5:05 PM EEST, Jarkko Sakkinen wrote:
On Tue Apr 16, 2024 at 6:20 AM EEST, Haitao Huang wrote:
> With different cgroups, the script starts one or multiple concurrent SGX
> selftests (test_sgx), each to run the unclobbered_vdso_oversubscribed
> test case, which loads an enclave of EPC size equal to the EPC capacity
> available on the platform. The script checks results against the
> expectation set for each cgroup and reports success or failure.
>
> The script creates 3 different cgroups at the beginning with following
> expectations:
>
> 1) SMALL - intentionally small enough to fail the test loading an
> enclave of size equal to the capacity.
> 2) LARGE - large enough to run up to 4 concurrent tests but fail some if
> more than 4 concurrent tests are run. The script starts 4 expecting at
> least one test to pass, and then starts 5 expecting at least one test
> to fail.
> 3) LARGER - limit is the same as the capacity, large enough to run lots of
> concurrent tests. The script starts 8 of them and expects all pass.
> Then it reruns the same test with one process randomly killed and
> usage checked to be zero after all processes exit.
>
> The script also includes a test with low mem_cg limit and LARGE sgx_epc
> limit to verify that the RAM used for per-cgroup reclamation is charged
> to a proper mem_cg. For this test, it turns off swapping before start,
> and turns swapping back on afterwards.
>
> Add README to document how to run the tests.
>
> Signed-off-by: Haitao Huang <haitao.huang@xxxxxxxxxxxxxxx>
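
The per-cgroup expectations quoted above (LARGE: at least one of 4 passes and at least one of 5 fails; LARGER: all 8 pass) come down to counting how many concurrent runs succeed. A minimal sketch of that counting step follows. The helper name count_passes is hypothetical, and how the real script places each test_sgx process into its cgroup is not shown in this message, so the command to run is left as an argument.

```shell
# Hypothetical helper: start $1 concurrent copies of a command (the remaining
# arguments), wait for all of them, and print how many exited with status 0.
count_passes() {
    n="$1"; shift
    pids=""
    i=0
    while [ "$i" -lt "$n" ]; do
        # Run each copy in the background and remember its PID.
        "$@" >/dev/null 2>&1 &
        pids="$pids $!"
        i=$((i + 1))
    done
    passed=0
    for pid in $pids; do
        # wait returns the exit status of the given background job.
        if wait "$pid"; then
            passed=$((passed + 1))
        fi
    done
    echo "$passed"
}

# Example with stand-in commands instead of the real test binary:
#   [ "$(count_passes 4 true)" -ge 1 ]   # "at least one passes"
#   [ "$(count_passes 5 false)" -lt 5 ]  # "at least one fails"
```

Waiting on each PID separately, rather than a bare wait, is what makes the individual exit statuses available for comparison against the expectation.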

jarkko@mustatorvisieni:~/linux-tpmdd> sudo make -C tools/testing/selftests/sgx run_tests
make: Entering directory '/home/jarkko/linux-tpmdd/tools/testing/selftests/sgx'
gcc -Wall -Werror -g -I/home/jarkko/linux-tpmdd/tools/testing/selftests/../../../tools/include -fPIC -c main.c -o /home/jarkko/linux-tpmdd/tools/testing/selftests/sgx/main.o
gcc -Wall -Werror -g -I/home/jarkko/linux-tpmdd/tools/testing/selftests/../../../tools/include -fPIC -c load.c -o /home/jarkko/linux-tpmdd/tools/testing/selftests/sgx/load.o
gcc -Wall -Werror -g -I/home/jarkko/linux-tpmdd/tools/testing/selftests/../../../tools/include -fPIC -c sigstruct.c -o /home/jarkko/linux-tpmdd/tools/testing/selftests/sgx/sigstruct.o
gcc -Wall -Werror -g -I/home/jarkko/linux-tpmdd/tools/testing/selftests/../../../tools/include -fPIC -c call.S -o /home/jarkko/linux-tpmdd/tools/testing/selftests/sgx/call.o
gcc -Wall -Werror -g -I/home/jarkko/linux-tpmdd/tools/testing/selftests/../../../tools/include -fPIC -c sign_key.S -o /home/jarkko/linux-tpmdd/tools/testing/selftests/sgx/sign_key.o
gcc -Wall -Werror -g -I/home/jarkko/linux-tpmdd/tools/testing/selftests/../../../tools/include -fPIC -o /home/jarkko/linux-tpmdd/tools/testing/selftests/sgx/test_sgx /home/jarkko/linux-tpmdd/tools/testing/selftests/sgx/main.o /home/jarkko/linux-tpmdd/tools/testing/selftests/sgx/load.o /home/jarkko/linux-tpmdd/tools/testing/selftests/sgx/sigstruct.o /home/jarkko/linux-tpmdd/tools/testing/selftests/sgx/call.o /home/jarkko/linux-tpmdd/tools/testing/selftests/sgx/sign_key.o -z noexecstack -lcrypto
gcc -Wall -Werror -static-pie -nostdlib -ffreestanding -fPIE -fno-stack-protector -mrdrnd -I/home/jarkko/linux-tpmdd/tools/testing/selftests/../../../tools/include test_encl.c test_encl_bootstrap.S -o /home/jarkko/linux-tpmdd/tools/testing/selftests/sgx/test_encl.elf -Wl,-T,test_encl.lds,--build-id=none
/usr/lib64/gcc/x86_64-suse-linux/13/../../../../x86_64-suse-linux/bin/ld: warning: /tmp/ccqvDJVg.o: missing .note.GNU-stack section implies executable stack
/usr/lib64/gcc/x86_64-suse-linux/13/../../../../x86_64-suse-linux/bin/ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker
TAP version 13
1..2
# timeout set to 45
# selftests: sgx: test_sgx
# TAP version 13
# 1..16
# # Starting 16 tests from 1 test cases.
# # RUN enclave.unclobbered_vdso ...
# # OK enclave.unclobbered_vdso
# ok 1 enclave.unclobbered_vdso
# # RUN enclave.unclobbered_vdso_oversubscribed ...
# # OK enclave.unclobbered_vdso_oversubscribed
# ok 2 enclave.unclobbered_vdso_oversubscribed
# # RUN enclave.unclobbered_vdso_oversubscribed_remove ...
# # main.c:402:unclobbered_vdso_oversubscribed_remove:Creating an enclave with 98566144 bytes heap may take a while ...
# # main.c:457:unclobbered_vdso_oversubscribed_remove:Changing type of 98566144 bytes to trimmed may take a while ...
# # main.c:473:unclobbered_vdso_oversubscribed_remove:Entering enclave to run EACCEPT for each page of 98566144 bytes may take a while ...
# # main.c:494:unclobbered_vdso_oversubscribed_remove:Removing 98566144 bytes from enclave may take a while ...
# # OK enclave.unclobbered_vdso_oversubscribed_remove
# ok 3 enclave.unclobbered_vdso_oversubscribed_remove
# # RUN enclave.clobbered_vdso ...
# # OK enclave.clobbered_vdso
# ok 4 enclave.clobbered_vdso
# # RUN enclave.clobbered_vdso_and_user_function ...
# # OK enclave.clobbered_vdso_and_user_function
# ok 5 enclave.clobbered_vdso_and_user_function
# # RUN enclave.tcs_entry ...
# # OK enclave.tcs_entry
# ok 6 enclave.tcs_entry
# # RUN enclave.pte_permissions ...
# # OK enclave.pte_permissions
# ok 7 enclave.pte_permissions
# # RUN enclave.tcs_permissions ...
# # OK enclave.tcs_permissions
# ok 8 enclave.tcs_permissions
# # RUN enclave.epcm_permissions ...
# # OK enclave.epcm_permissions
# ok 9 enclave.epcm_permissions
# # RUN enclave.augment ...
# # OK enclave.augment
# ok 10 enclave.augment
# # RUN enclave.augment_via_eaccept ...
# # OK enclave.augment_via_eaccept
# ok 11 enclave.augment_via_eaccept
# # RUN enclave.tcs_create ...
# # OK enclave.tcs_create
# ok 12 enclave.tcs_create
# # RUN enclave.remove_added_page_no_eaccept ...
# # OK enclave.remove_added_page_no_eaccept
# ok 13 enclave.remove_added_page_no_eaccept
# # RUN enclave.remove_added_page_invalid_access ...
# # OK enclave.remove_added_page_invalid_access
# ok 14 enclave.remove_added_page_invalid_access
# # RUN enclave.remove_added_page_invalid_access_after_eaccept ...
# # OK enclave.remove_added_page_invalid_access_after_eaccept
# ok 15 enclave.remove_added_page_invalid_access_after_eaccept
# # RUN enclave.remove_untouched_page ...
# # OK enclave.remove_untouched_page
# ok 16 enclave.remove_untouched_page
# # PASSED: 16 / 16 tests passed.
# # Totals: pass:16 fail:0 xfail:0 xpass:0 skip:0 error:0
ok 1 selftests: sgx: test_sgx
# timeout set to 45
# selftests: sgx: run_epc_cg_selftests.sh
# # Setting up limits.
# ./run_epc_cg_selftests.sh: line 50: echo: write error: Invalid argument
# # Failed setting up misc limits.
not ok 2 selftests: sgx: run_epc_cg_selftests.sh # exit=1
make: Leaving directory

This means the SGX cgroup is not turned on (echoing sgx_epc entries into misc.max is not allowed).
v12 removed the need for config CGROUP_SGX_EPC.
Were you by chance running on a previous kernel build without the SGX cgroup configured?

I did declare the configs in the config file but I missed it in my patch as stated earlier. IIUC, that would not cause this error though.

Maybe I should exit with the skip code if no CGROUP_MISC (no more CGROUP_SGX_EPC) is configured?
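
A minimal sketch of such a skip check, under stated assumptions: cgroup v2 is mounted at /sys/fs/cgroup, and a kernel with the EPC cgroup lists sgx_epc in the root misc.capacity file. The helper names are hypothetical and not part of the posted patch.

```shell
# Kselftest framework requirement - SKIP code is 4.
ksft_skip=4

# Hypothetical helper: succeeds if the given misc.capacity file lists sgx_epc.
# The default path assumes cgroup v2 is mounted at /sys/fs/cgroup.
epc_cgroup_supported() {
    capacity_file="${1:-/sys/fs/cgroup/misc.capacity}"
    [ -r "$capacity_file" ] && grep -q '^sgx_epc ' "$capacity_file"
}

# Hypothetical helper: returns the kselftest skip code when unsupported,
# so the caller decides whether to exit.
check_epc_cgroup_or_skip() {
    if ! epc_cgroup_supported "$1"; then
        echo "SKIP: sgx_epc is not listed in misc.capacity."
        return "$ksft_skip"
    fi
    return 0
}

# Intended use near the top of the test script:
#   check_epc_cgroup_or_skip || exit $?
```

Checking misc.capacity instead of the kernel config keeps the test independent of how the running kernel was built, and the failing write to misc.max is never attempted.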

'/home/jarkko/linux-tpmdd/tools/testing/selftests/sgx'

This is what happens now.

BTW, I noticed a file that should not exist, i.e. README. The only thing
that should exist is the tests for kselftest, and anything else should
not exist at all, so this file by definition should not exist.

Could you point me to this rule?

I felt some instructions were needed as the tests are getting more complex, and I was following existing examples:

tools/testing/selftests$ find . -name README
./futex/README
./tc-testing/README
./net/forwarding/README
./powerpc/nx-gzip/README
./ftrace/README
./arm64/signal/README
./arm64/fp/README
./arm64/README
./zram/README
./livepatch/README
./resctrl/README

I'd suggest sanity-checking the kselftest with a person from Intel who
has worked with kselftest before the next version, so that it will be
nailed next time. Or better, internally review this single patch with a
person with expertise in kernel QA.

I'll double check.

I did not check this, but I also have a suspicion that it might have some
checks whether it is run as root or not. If there are any, those should
be removed too. Let people set up their environment however they want...

Do you mean this part?

+# Kselftest framework requirement - SKIP code is 4.
+ksft_skip=4
+if [ "$(id -u)" -ne 0 ]; then
+ echo "SKIP: SGX Cgroup tests need root privileges."
+ exit $ksft_skip
+fi

I saw lots of similar code reported when I ran the following in the selftests directory:

tools/testing/selftests$ grep -C 5 -r "root" */*.sh

Thanks
Haitao