Re: [PATCH v5 12/18] x86/sgx: Add EPC OOM path to forcefully reclaim EPC

From: Haitao Huang
Date: Tue Oct 17 2023 - 08:58:13 EST


On Mon, 16 Oct 2023 20:34:57 -0500, Huang, Kai <kai.huang@xxxxxxxxx> wrote:

> On Mon, 2023-10-16 at 19:10 -0500, Haitao Huang wrote:
>> On Mon, 16 Oct 2023 16:09:52 -0500, Huang, Kai <kai.huang@xxxxxxxxx> wrote:
>> [...]
>>
>>> still need to fix the bug mentioned above here.
>>>
>>> I really think you should just go this simple way:
>>>
>>> When you want to take EPC back from VM, kill the VM.
>>>
>>
>> My only concern is that this is a compromise due to current limitation (no
>> other sane way to take EPC from VMs). If we define this behavior and it
>> becomes a contract to user space, then we can't change it in future.
>
> Why do we need to "define such behaviour"?
>
> This isn't some kind of kernel/userspace ABI IMHO, but only kernel internal
> implementation. Here VM is similar to normal host enclaves. You limit the
> resource, some host enclaves could be killed. Similarly, VM could also be
> killed.
>
> And supporting VMM EPC oversubscription doesn't mean VM won't be killed. The VM
> can still be a target to kill after all of the VM's EPC pages have been
> swapped out.
>
>>
>> On the other hand, my understanding is that the reason you want this
>> behavior is to enforce EPC limit at runtime.
>
> No, I just thought this is a bug/issue that needs to be fixed. If anyone
> believes this is not a bug/issue then it's a separate discussion.
>


AFAIK, before we introduced the max_write() callback in this series, no misc
controller would enforce the limit when misc.max is reduced. E.g., I don't
think CVMs would be killed when the ASID limit is reduced and the cgroup was
already full before the limit was reduced.

I think EPC pages given to VMs could have the same behavior: once they are
given to a guest, they are never taken back by the host. For enclaves on the
host side, pages are reclaimable, which allows us to enforce the limit in a
similar way to memcg.
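
To make the distinction concrete, here is a rough userspace sketch of the two
behaviors. It is a toy model under my own assumptions, not kernel code: every
toy_* name is hypothetical, and the reclaim hook only stands in for the idea
behind max_write(); it is not the callback's actual signature.

```c
#include <stdbool.h>

/* Toy model of one misc-style cgroup resource. All names are hypothetical. */
struct toy_misc_cg {
	unsigned long usage;	/* units currently charged */
	unsigned long max;	/* current limit, as written to misc.max */
};

/* Hypothetical reclaim hook: returns how many units it managed to free. */
typedef unsigned long (*toy_reclaim_fn)(struct toy_misc_cg *cg,
					unsigned long excess);

/* Charge-time check: new charges fail once the limit would be exceeded. */
static bool toy_try_charge(struct toy_misc_cg *cg, unsigned long amount)
{
	if (amount > cg->max || cg->usage > cg->max - amount)
		return false;
	cg->usage += amount;
	return true;
}

/* Lowering the limit with no hook: existing usage is left untouched. */
static void toy_set_max(struct toy_misc_cg *cg, unsigned long new_max)
{
	cg->max = new_max;
}

/* Lowering the limit with a hook: try to reclaim the excess right away. */
static void toy_set_max_enforced(struct toy_misc_cg *cg,
				 unsigned long new_max,
				 toy_reclaim_fn reclaim)
{
	cg->max = new_max;
	if (reclaim && cg->usage > new_max) {
		unsigned long excess = cg->usage - new_max;
		unsigned long freed = reclaim(cg, excess);

		/* Never account for more than the excess we asked for. */
		if (freed > excess)
			freed = excess;
		cg->usage -= freed;
	}
}

/* Stand-in for a fully reclaimable resource, e.g. host enclave pages. */
static unsigned long toy_reclaim_all(struct toy_misc_cg *cg,
				     unsigned long excess)
{
	(void)cg;
	return excess;
}
```

With toy_set_max(), a cgroup that was full before the limit was lowered stays
over the limit and only new charges are refused, which is the behavior I am
suggesting for EPC pages handed to VMs. With toy_set_max_enforced(), usage is
brought down at write time, which only works when the resource is reclaimable.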

Thanks
Haitao