Re: [PATCH v3 03/25] x86/sgx: Wipe out EREMOVE from sgx_free_epc_page()

From: Jarkko Sakkinen
Date: Wed Mar 24 2021 - 05:28:08 EST


On Tue, Mar 23, 2021 at 05:06:04PM +0100, Borislav Petkov wrote:
> On Tue, Mar 23, 2021 at 03:45:14PM +0000, Sean Christopherson wrote:
> > Practically speaking, "basic" deployments of SGX VMs will be insulated from
> > this bug. KVM doesn't support EPC oversubscription, so even if all EPC is
> > exhausted, new VMs will fail to launch, but existing VMs will continue to chug
> > along with no ill effects....
>
> Ok, so it sounds to me like *at* *least* there should be some writeup in
> Documentation/ explaining to the user what to do when she sees such an
> EREMOVE failure, perhaps the gist of this thread and then possibly the
> error message should point to that doc.
>
> We will of course have to revisit when this hits the wild and people
> start (or not) hitting this. But judging by past experience, if it is
> there, we will hit it. Murphy says so.
>
> Thx.

We recently had a steady stream of bug reports about a WARN() in the tpm_tis
driver, from people at all levels of involvement with the kernel. Even people
who don't know what kernel documentation is got their message through.

When a WARN() triggers anywhere in the kernel, people tend to go to the
distro bugzilla, and the issue is quickly escalated to the corresponding
maintainer.

So, what is the part missing from the equation that should be documented
in the kernel documentation? This is not a counterargument per se; I just
don't fully understand what the missing piece is that should be put there.

> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette
>

/Jarkko