Re: [RFC PATCH 00/20] Add Cgroup support for SGX EPC memory

From: Tejun Heo
Date: Fri Sep 23 2022 - 20:11:40 EST


Hello,

On Thu, Sep 22, 2022 at 02:03:52PM -0700, Dave Hansen wrote:
> On 9/22/22 12:08, Tejun Heo wrote:
> > Can you please give more concrete examples? I'd love to hear how the SGX EPC
> > memory is typically used in what amounts and what's the performance
> > implications when they get reclaimed and so on. ie. Please describe a
> > realistic usage scenario of contention with sufficient details on how the
> > system is set up, what the applications are using the SGX EPC memory for and
> > how much, how the contention on memory affects the users and so on.
>
> One wrinkle is that the apps that use SGX EPC memory are *normal* apps.
> There are frameworks that some folks are very excited about that allow
> you to run mostly unmodified app stacks inside SGX. For example:
>
> https://github.com/gramineproject/graphene
>
> In fact, Gramine users are the troublesome ones for overcommit. Most
> explicitly-written SGX applications are quite austere in their SGX
> memory use; they're probably never going to see overcommit. These
> Gramine-wrapped apps are (relative) pigs. They've been the ones finding
> bugs in the existing SGX overcommit code.
>
> So, where does all the SGX memory go? It's the usual suspects:
> memcached and redis. ;)

Hey, so, I'm a bit wary that this doesn't seem to have strong demand at
this point. When there's clear shared demand, I usually hear from multiple
parties about their use cases and the practical problems they're trying to
solve and so on. This, at least to me, seems driven primarily by producers
rather than consumers.

There's nothing wrong with projecting future usages and jumping ahead of the
curve but there's a balance to hit, and going for a full-on memcg-style
controller with three control knobs seems to be jumping the gun and may create
commitments which we end up looking back on with a bit of regret.

Given that, how about this? We can easily add the functionality of .max
through the misc controller. Add a new key there, try-charge when allocating
new memory, if that fails, try reclaim, and then fail the allocation if
reclaim fails hard enough. I believe that should give at least a reasonable
place to start, especially given that memcg only had limits with similar
semantics for quite a while at the beginning.
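
To make the charge / reclaim / fail flow concrete, here is a rough userspace
sketch in C. It is only a model under stated assumptions: every name in it
(struct epc_cg, epc_try_charge(), epc_reclaim(), epc_alloc(),
EPC_MAX_RECLAIM_RETRIES) is hypothetical and none of them are kernel or misc
controller APIs. It tracks a single group with no hierarchy, and "reclaim" is
reduced to decrementing a counter of evictable pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one group's EPC accounting, in pages. */
struct epc_cg {
	size_t usage;        /* pages currently charged */
	size_t max;          /* hard limit, the ".max" semantics */
	size_t reclaimable;  /* charged pages that reclaim could still evict */
};

/* Assumed bound standing in for "reclaim fails hard enough". */
#define EPC_MAX_RECLAIM_RETRIES 3

/* Charge only if the group stays within its limit. */
static bool epc_try_charge(struct epc_cg *cg, size_t pages)
{
	if (pages > cg->max || cg->usage > cg->max - pages)
		return false;
	cg->usage += pages;
	return true;
}

/* Evict up to 'want' pages from the group; returns how many were freed. */
static size_t epc_reclaim(struct epc_cg *cg, size_t want)
{
	size_t got = want < cg->reclaimable ? want : cg->reclaimable;

	assert(got <= cg->usage);
	cg->reclaimable -= got;
	cg->usage -= got;
	return got;
}

/*
 * Allocation path: try to charge; on failure reclaim and retry; give up
 * once reclaim makes no progress or the retry budget is exhausted.
 */
static bool epc_alloc(struct epc_cg *cg, size_t pages)
{
	int retries = EPC_MAX_RECLAIM_RETRIES;

	while (!epc_try_charge(cg, pages)) {
		if (retries-- == 0 || epc_reclaim(cg, pages) == 0)
			return false;
	}
	return true;
}
```

In this model an allocation that does not fit first pushes out the group's
own evictable pages and only fails once nothing more can be reclaimed, which
is the limit-only behavior described above. Pages freed by a failed attempt
stay freed; whether a real implementation should behave that way is left
open here.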

That way, we avoid creating big interface commitments while providing a
feature which should be able to serve and test out the immediate use cases.
If, for some reason, many of us end up running hefty applications in SGX, we
can revisit the issue and build up something more complete with provisions
for backward compatibility.

Thanks.

--
tejun