Re: [PATCH v6 09/12] x86/sgx: Restructure top-level EPC reclaim function
From: Michal Koutný
Date: Thu Jan 04 2024 - 07:38:55 EST
Hello.
On Mon, Dec 18, 2023 at 03:24:40PM -0600, Haitao Huang <haitao.huang@xxxxxxxxxxxxxxx> wrote:
> Thanks for raising this. Actually my understanding the above discussion was
> mainly about not doing reclaiming by killing enclaves, i.e., I assumed
> "reclaiming" within that context only meant for that particular kind.
>
> As Mikko pointed out, without reclaiming per-cgroup, the max limit of each
> cgroup needs to accommodate the peak usage of enclaves within that cgroup.
> That may be inconvenient for practical usage and limits could be forced to
> be set larger than necessary to run enclaves performantly. For example, we
> can observe following undesired consequences comparing a system with current
> kernel loaded with enclaves whose total peak usage is greater than the EPC
> capacity.
>
> 1) If a user wants to load the same exact enclaves but in separate cgroups,
> then the sum of cgroup limits must be higher than the capacity and the
> system will end up doing the same old global reclaiming as it is currently
> doing. Cgroup is not useful at all for isolating EPC consumptions.
That is the use of limits to prevent a runaway cgroup smothering the
system. Overcommited values in such a config are fine because the more
simultaneous runaways, the less likely.
The peak consumption is on the fair expense of others (some efficiency)
and the limit contains the runaway (hence the isolation).
> 2) To isolate impact of usage of each cgroup on other cgroups and yet still
> being able to load each enclave, the user basically has to carefully plan to
> ensure the sum of cgroup max limits, i.e., the sum of peak usage of
> enclaves, is not reaching over the capacity. That means no over-commiting
> allowed and the same system may not be able to load as many enclaves as with
> current kernel.
Another "config layout" of limits is to achieve partitioning (sum ==
capacity). That is perfect isolation but it naturally goes against
efficient utilization. The way other controllers approach this trade-off
is with weights (cpu, io) or protections (memory). I'm afraid misc
controller is not ready for this.
My opinion is to start with the simple limits (first patches) and think
of prioritization/guarantee mechanism based on real cases.
HTH,
Michal
Attachment:
signature.asc
Description: PGP signature