On 18/06/2024 12:53 am, Huang, Haitao wrote:
From: Kristen Carlson Accardi <kristen@xxxxxxxxxxxxxxx>
Previous patches have implemented all infrastructure needed for
per-cgroup EPC page tracking and reclaiming. But all reclaimable EPC
pages are still tracked in the global LRU as sgx_epc_page_lru() always
returns a reference to the global LRU.
Change sgx_epc_page_lru() to return the LRU of the cgroup in which the
given EPC page is allocated.
This makes all EPC pages tracked in per-cgroup LRUs, and the global
reclaimer (ksgxd) will no longer be able to reclaim any pages from the
global LRU. However, in cases of over-committing, i.e., when the sum of
cgroup limits is greater than the total capacity, cgroups may never reclaim on their own but
the total usage can still be near the capacity. Therefore a global
reclamation is still needed in those cases and it should be performed
from the root cgroup.
Modify sgx_reclaim_pages_global() to reclaim from the root EPC cgroup
when the EPC cgroup is enabled. Similar to sgx_cgroup_reclaim_pages(),
return the next cgroup so callers can use it as the new starting node
for the next round of reclamation if needed.
Also update sgx_can_reclaim_global() to check the emptiness of the LRUs
of all cgroups when the EPC cgroup is enabled, and otherwise only check
the global LRU.
Finally, change sgx_reclaim_direct() to check and ensure there are free
pages at the cgroup level so forward progress can be made by the caller.
Reading the above, it's not clear how the _new_ global reclaim works with
multiple LRUs.
E.g., the current global reclaim essentially treats all EPC pages equally
when scanning those pages. From the above, I don't see how this is
achieved in the new global reclaim.
The changelog should:
1) describe how the existing global reclaim works, and then describe how
the same behaviour is achieved in the new global reclaim which works
with multiple LRUs (a rough sketch of my understanding is below);
2) if there is any behaviour difference between the "existing" vs the
"new" global reclaim, the changelog should point out the difference and
explain why the difference is acceptable.
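For reference, this is roughly how I understand the two (simplified, and
the details may well be off):

	/* existing global reclaim: one LRU, all enclaves aged together */
	sgx_reclaim_pages(&sgx_global_lru, charge_mm);

i.e. a batch of pages is taken from the head of the single global LRU
regardless of which enclave or cgroup they belong to, so all EPC pages
in the system are scanned in (roughly) LRU order;

	/* new global reclaim: walk the cgroup hierarchy from the root */
	sgx_cgroup_reclaim_pages(misc_cg_root(), next_cg, charge_mm);

i.e. per-cgroup LRUs are reclaimed in tree-walk order, so which pages
get scanned first now depends on where the walk starts and stops rather
than on page age across the whole system.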
With these changes, the global reclamation and per-cgroup reclamation
both work properly with all pages tracked in per-cgroup LRUs.
[...]
-static void sgx_reclaim_pages_global(struct mm_struct *charge_mm)
+static struct misc_cg *sgx_reclaim_pages_global(struct misc_cg *next_cg,
+ struct mm_struct *charge_mm)
{
+ if (IS_ENABLED(CONFIG_CGROUP_MISC))
+ return sgx_cgroup_reclaim_pages(misc_cg_root(), next_cg, charge_mm);
+
sgx_reclaim_pages(&sgx_global_lru, charge_mm);
+ return NULL;
}
/*
@@ -414,12 +443,35 @@ static void sgx_reclaim_pages_global(struct mm_struct *charge_mm)
*/
void sgx_reclaim_direct(void)
{
+ struct sgx_cgroup *sgx_cg = sgx_get_current_cg();
+ struct misc_cg *cg = misc_from_sgx(sgx_cg);
From the code below, @sgx_cg could be NULL. It's not immediately clear
whether calling misc_from_sgx(sgx_cg) unconditionally is safe here.
Leave the initialization of @cg to a later point where @sgx_cg is
guaranteed not to be NULL, or initialize @cg to NULL here and update it
later.
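E.g., something like below (untested, just to illustrate the second
option):

	struct sgx_cgroup *sgx_cg = sgx_get_current_cg();
	struct misc_cg *cg = NULL;
	struct misc_cg *next_cg = NULL;

	...

	if (sgx_cg && sgx_cgroup_should_reclaim(sgx_cg)) {
		cg = misc_from_sgx(sgx_cg);
		next_cg = sgx_cgroup_reclaim_pages(cg, NULL, current->mm);
	}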
+ struct misc_cg *next_cg = NULL;
+
+ /*
+ * Make sure there are some free pages at both cgroup and global levels.
+ * In both cases, only make one attempt of reclamation to avoid lengthy
+ * block on the caller.
+ */
+ if (sgx_cg && sgx_cgroup_should_reclaim(sgx_cg))
+ next_cg = sgx_cgroup_reclaim_pages(cg, next_cg, current->mm);
I don't quite follow the logic.
First of all, sgx_cgroup_reclaim_pages() isn't called in a loop, so why
not just do:
next_cg = sgx_cgroup_reclaim_pages(cg, NULL, current->mm);
And what is the point of setting @next_cg here, since ...
+
+ if (next_cg != cg)
+ put_misc_cg(next_cg);
+
+ next_cg = NULL;
... here @next_cg is reset to NULL ?
It looks like the only reason is that you need to do ...
put_misc_cg(next_cg);
... above?
This piece of code appears repeatedly in this file. Is there any way we
can get rid of it?
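E.g., could it be wrapped in a small helper along these lines (the name
is made up by me, untested)?

	/*
	 * Drop the reference on the reclaim cursor @next_cg, unless it is
	 * @stop (which the callers never want to put), and reset it.
	 */
	static inline void sgx_put_next_cg(struct misc_cg **next_cg,
					   struct misc_cg *stop)
	{
		if (*next_cg != stop)
			put_misc_cg(*next_cg);

		*next_cg = NULL;
	}

Then the callers become sgx_put_next_cg(&next_cg, cg) here and
sgx_put_next_cg(&next_cg, misc_cg_root()) in the other places.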
if (sgx_should_reclaim_global(SGX_NR_LOW_PAGES))
- sgx_reclaim_pages_global(current->mm);
+ next_cg = sgx_reclaim_pages_global(next_cg, current->mm);
And this doesn't seem to be "global reclaim" at all?
Because it essentially equals to:
next_cg = sgx_reclaim_pages_global(NULL, current->mm);
which always reclaims from the ROOT.
So each call to sgx_reclaim_direct() will always reclaim from the ROOT --
any other LRUs in the hierarchy are unlikely to get any chance to be
reclaimed.
+
+ if (next_cg != misc_cg_root())
+ put_misc_cg(next_cg);
+
+ sgx_put_cg(sgx_cg);
}
static int ksgxd(void *p)
{
+ struct misc_cg *next_cg = NULL;
+
set_freezable();
/*
@@ -437,11 +489,15 @@ static int ksgxd(void *p)
kthread_should_stop() ||
sgx_should_reclaim_global(SGX_NR_HIGH_PAGES));
- if (sgx_should_reclaim_global(SGX_NR_HIGH_PAGES))
+ while (!kthread_should_stop() && sgx_should_reclaim_global(SGX_NR_HIGH_PAGES)) {
/* Indirect reclaim, no mm to charge, so NULL: */
- sgx_reclaim_pages_global(NULL);
+ next_cg = sgx_reclaim_pages_global(next_cg, NULL);
+ cond_resched();
Should the 'put_misc_cg(next_cg)' be done within the while() loop but not
below?
Hopefully adding comments mentioned above clarifies this.
+ }
- cond_resched();
+ if (next_cg != misc_cg_root())
+ put_misc_cg(next_cg);
+ next_cg = NULL;
Again, it doesn't seem like "global reclaim" here, since you always restart
from the ROOT once the target pages have been reclaimed.
AFAICT it's completely different from the existing global reclaim.
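To show what I mean, something like below (a rough, untested sketch,
assuming sgx_reclaim_pages_global() keeps handling the reference on the
cursor it is passed, as the current patch appears to) would keep the
walk position across wakeups instead of always restarting from the
root:

	static int ksgxd(void *p)
	{
		struct misc_cg *next_cg = NULL;

		set_freezable();

		while (!kthread_should_stop()) {
			/* ... wait until reclaim is needed, as today ... */

			while (!kthread_should_stop() &&
			       sgx_should_reclaim_global(SGX_NR_HIGH_PAGES)) {
				/* Indirect reclaim, no mm to charge, so NULL: */
				next_cg = sgx_reclaim_pages_global(next_cg, NULL);
				cond_resched();
			}

			/*
			 * Do not reset @next_cg here: keep the cursor so the
			 * next wakeup continues the walk where this one
			 * stopped, instead of restarting from the root.
			 */
		}

		/* Drop whatever cursor is left when the thread stops. */
		if (next_cg != misc_cg_root())
			put_misc_cg(next_cg);

		return 0;
	}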
}
return 0;
@@ -583,6 +639,7 @@ int sgx_unmark_page_reclaimable(struct sgx_epc_page *page)
*/
struct sgx_epc_page *sgx_alloc_epc_page(void *owner, enum sgx_reclaim reclaim)
{
+ struct misc_cg *next_cg = NULL;
struct sgx_cgroup *sgx_cg;
struct sgx_epc_page *page;
int ret;
@@ -616,10 +673,19 @@ struct sgx_epc_page *sgx_alloc_epc_page(void *owner, enum sgx_reclaim reclaim)
break;
}
- sgx_reclaim_pages_global(current->mm);
+ /*
+ * At this point, the usage within this cgroup is under its
+ * limit but there is no physical page left for allocation.
+ * Perform a global reclaim to get some pages released from any
+ * cgroup with reclaimable pages.
+ */
+ next_cg = sgx_reclaim_pages_global(next_cg, current->mm);
cond_resched();
}
Ditto IIUC.