Re: SGX vs LSM (Re: [PATCH v20 00/28] Intel SGX1 support)

From: Sean Christopherson
Date: Wed May 22 2019 - 11:41:20 EST


On Wed, May 22, 2019 at 09:56:30AM -0400, Stephen Smalley wrote:
> On 5/22/19 9:22 AM, Jarkko Sakkinen wrote:
> >On Wed, May 22, 2019 at 04:20:22PM +0300, Jarkko Sakkinen wrote:
> >>On Tue, May 21, 2019 at 08:51:40AM -0700, Sean Christopherson wrote:
> >>>Except that mmap() is more or less required to guarantee that ELRANGE
> >>>established by ECREATE is available. And we want to disallow mmap() as
> >>>soon as the first EADD is done so that userspace can't remap the enclave's
> >>>VMAs via munmap()->mmap() and gain execute permissions to pages that were
> >>>EADD'd as NX.
> >>
> >>We don't want to guarantee such thing and it is not guaranteed. It does
> >>not fit at all to the multi process work done. Enclaves are detached
> >>from any particular process addresse spaces. It is responsibility of
> >>process to open windows to them.
> >>
> >>That would be completely against work that we've done lately.
> >
> >Example use case: you have a process that just constructs an enclave
> >and sends it to another process or processes for use. The constructor
> >process could have basically anything on that range. This was the key
> >goal of the fd based enclave work.
>
> What exactly happens in the constructor versus the recipient processes?
> Which process performs each of the necessary open(), mmap(), and ioctl()
> calls for setting up the enclave? Can you provide a high level overview of
> the sequence of userspace calls by the constructor and by the recipient
> similar to what Sean showed earlier for just a single process?

Hmm, what we had talked about was allowing the SGX ioctls to work without
an associated VMA, with the end goal of letting userspace restrict access
to /dev/sgx/enclave. Very roughly...

Enclave Owner:

connect(builder, ...);
send(builder, "/home/sean/path/to/my/enclave");

recv(builder, &enclave_fd);

for_each_chunk {
mmap(enclave_addr + offset, size, ..., MAP_SHARED, enclave_fd, 0);
}


Enclave Builder:

recv(sock, &enclave_path);

source_fd = open(enclave_path, O_RDONLY);
for_each_chunk {
<hand waving - mmap()/mprotect() the enclave file into regular memory>
}

enclave_fd = open("/dev/sgx/enclave", O_RDWR);

ioctl(enclave_fd, ENCLAVE_CREATE, ...);
for_each_chunk {
struct sgx_enclave_add ioctlargs = {
.offset = chunk.offset,
.source = chunk.addr,
.size = chunk.size,
.type = chunk.type, /* SGX specific metadata */
}
ioctl(fd, ENCLAVE_ADD, &ioctlargs); /* modifies enclave's VMAs */
}
ioctl(enclave_fd, ENCLAVE_INIT, ...);

write(sock, enclave_fd);


But the above flow is flawed because there'a catch-22: ENCLAVE_ECREATE
takes the virtual address of the enclave, but in the above flow that's
not established until "mmap(..., enclave_fd)". And because an enclave's
virtual range needs to be naturally aligned (hardware requirements), the
enclave owner would need to do something like:

source_fd = open("/home/sean/path/to/my/enclave", O_RDONLY);
size = <parse size from source_fd>

enclave_range = mmap(NULL, size*2, PROT_READ, ???, NULL, 0);
enclave_addr = (enclave_range + (size - 1)) & ~(size - 1);

connect(builder, ...);
send(builder, {"/home/sean/path/to/my/enclave", enclave_addr});

recv(builder, &enclave_fd);

munmap(enclave_range);

for_each_chunk {
addr = mmap(enclave_addr + c.offset, c.size, ..., MAP_SHARED, enclave_fd, 0);
if (addr != enclave_addr + c.offset)
exit(1);
}

And that straight up doesn't work with the v20 driver because mmap() with
the enclave_fd will run through sgx_get_unmapped_area(), which also does
the natural alignment adjustments (the idea being that mmap() is mapping
the entire enclave). E.g. mmap() will map the wrong address if the offset
of a chunk is less than its size due to the driver adjusting the address.

Eliminating sgx_get_unmapped_area() means userspace is once again on the
hook for naturally aligning the enclave, which is less than desirable.

Looking back at the original API discussions around a builder process[1],
we never fleshed out the end-to-end flow. While having a builder process
*sounds* reasonable, in practice it adds a lot of complexity without
providing much in the way of added security. E.g. in addition to the
above mmap() issues, since the order of EADDs affects the enclave
measurement, the enclave owner would need to communicate the exact steps
to build the enclave, or the builder would need a priori knowledge of the
enclave format.

Userspace can still restrict access to /dev/sgx/enclave, e.g. by having a
daemon that requires additional credentials to obtain a new enclave_fd.
So AFAICT, the only benefit to having a dedicated builder is that it can
do its own whitelisting of enclaves, but since we're trending towards
supporting whitelisting enclaves in the kernel, e.g. via sigstruct,
whitelisting in userspace purely in userspace also provides marginal value.

TL;DR: Requiring VMA backing to build an enclave seems reasonable and sane.

[1] https://lkml.kernel.org/r/CALCETrX+KisMCbptrnPSO79-YF4E3nR1XHt+a7hCs1GXsxAbtw@xxxxxxxxxxxxxx