Re: [PATCH v6 09/12] x86/sgx: Restructure top-level EPC reclaim function

From: Haitao Huang
Date: Fri Jan 12 2024 - 12:07:32 EST


On Sun, 17 Dec 2023 19:44:56 -0600, Huang, Kai <kai.huang@xxxxxxxxx> wrote:


>
> The point is, with or w/o this patch, you can only reclaim 16 EPC pages
> in one
> function call (as you have said you are going to remove
> SGX_NR_TO_SCAN_MAX,
> which is a cipher to both of us). The only difference I can see is,
> with this
> patch, you can have multiple calls of "isolate" and then call the
> "do_reclaim"
> once.
>
> But what's the good of having the "isolate" if the "do_reclaim" can only
> reclaim
> 16 pages anyway?
>
> Back to my last reply, are you afraid of any LRU has less than 16 pages
> to
> "isolate", therefore you need to loop LRUs of descendants to get 16?
> Cause I
> really cannot think of any other reason why you are doing this.
>
>

I think I see your point. By capping pages reclaimed per cycle to 16,
there is not much difference even if those 16 pages are spread in separate
LRUs . The difference is only significant when we ever raise that cap. To
preserve the current behavior of ewb loops independent on number of LRUs
to loop through for each reclaiming cycle, regardless of the exact value
of the page cap, I would still think current approach in the patch is
reasonable choice. What do you think?

To me I won't bother to do that. Having less than 16 pages in one LRU is
*extremely rare* that should never happen in practice. It's pointless to make
such code adjustment at this stage.

Let's focus on enabling functionality first. When you have some real
performance issue that is related to this, we can come back then.


I have done some rethinking about this and realize this does save quite some significant work: without breaking out isolation part from sgx_reclaim_pages(), I can remove the changes to use a list for isolated pages, and no need to introduce "state" such as RECLAIM_IN_PROGRESS. About 1/3 of changes for per-cgroup reclamation will be gone.

So I think I'll go this route now. The only downside may be performance if a enclave spreads its pages in different cgroups and even that is minimum impact as we limit reclamation to 16 pages a time. Let me know if someone feel strongly we need dealt with that and see some other potential issues I may have missed.

Thanks

Haitao