Re: [PATCH v8 07/15] x86/sgx: Expose sgx_reclaim_pages() for cgroup

From: Haitao Huang
Date: Tue Jan 30 2024 - 20:35:37 EST


On Tue, 30 Jan 2024 09:39:44 -0600, Huang, Kai <kai.huang@xxxxxxxxx> wrote:

+ * @lru: The LRU from which pages are reclaimed.
+ * @nr_to_scan: Pointer to the target number of pages to scan, must be less
than
+ * SGX_NR_TO_SCAN.
+ * Return: Number of pages reclaimed.
*/
-static void sgx_reclaim_pages(void)
+unsigned int sgx_reclaim_pages(struct sgx_epc_lru_list *lru, unsigned
+int *nr_to_scan)

Since the function is now returning the number of reclaimed pages, why do you need to make the @nr_to_scan as pointer?

Cannot the caller just adjust @nr_to_scan when calling this function based on how many pages have reclaimed?

I am not even sure whether you need @nr_to_scan at all because as we discussed I think it's just extremely rare you need to pass "< SGX_NR_TO_SCAN" to this function.

Even if you need, you can always choose to try to reclaim SGX_NR_TO_SCAN pages.


I tested with that approach and found we can only target number of pages attempted to reclaim not pages actually reclaimed due to the uncertainty of how long it takes to reclaim pages. Besides targeting number of scanned pages for each cycle is also what the ksgxd does.

If we target actual number of pages, sometimes it just takes too long. I saw more timeouts with the default time limit when running parallel selftests.

We also need return number of pages actually reclaimed as indication to caller whether we actually reclaimed any pages at all. The caller, e.g., sgx_epc_cg_try_charge(), then can decide to schedule() or continue next step which usually is allocation of the page.

[...]


+static void sgx_reclaim_pages_global(void) {
+ unsigned int nr_to_scan = SGX_NR_TO_SCAN;
+
+ sgx_reclaim_pages(&sgx_global_lru, &nr_to_scan); }
+

I think this function doesn't look sane at all when you have @nr_to_scan being a pointer?


You will see the pointer being used later for cgroup worker.

I am also not sure whether this function is needed -- if we don't add @nr_to_scan to sgx_reclaim_pages(), then this function is basically:

sgx_reclaim_pages(&sgx_global_lru);


As indicated in the commit message, this wrapper is getting ready for doing global reclamation from root cgroup. You will see it changed later.


Thanks
Haitao