Re: [PATCH v3] x86/sgx: Do not consider unsanitized pages an error

From: Jarkko Sakkinen
Date: Thu Aug 25 2022 - 15:16:13 EST


On Thu, Aug 25, 2022 at 11:38:00AM -0700, Dave Hansen wrote:
> On 8/25/22 11:27, Jarkko Sakkinen wrote:
> > On Thu, Aug 25, 2022 at 07:07:44AM -0700, Dave Hansen wrote:
> >> On 8/25/22 01:08, Jarkko Sakkinen wrote:
> >>> However, if the SGX subsystem initialization is retracted, the sanitization
> >>> process could end up in the middle, and sgx_dirty_page_list be left
> >>> non-empty for legit reasons.
> >> What does "retraction" mean in this context?
> > Rest of the initialization failing or features not detected (-ENODEV).
>
> Can you please work on communicating better descriptions of the
> problems? This really isn't good enough.

Sure, I can put more detail into this patch.

If you speak in general about commit messages, picking the correct
granularity is somewhat easy to fail because different people have
different expectations on that. If denoted, I'm happy to write more
detailed description, if the original is not granular enough.

> I think you're talking about sgx_init(). It launches ksgxd from
> sgx_page_reclaimer_init() which sets about sanitizing the
> 'dirty_page_list'. After launching ksgxd, if later actions in
> sgx_init() (misc_register(), sgx_drv_init(), sgx_vepc_init()) fail,
> ksgxd will be stopped prematurely.

It's a bit more complicated, as either sgx_drv_init() or sgx_vepc_init()
can fail without premature end for ksgxd.

So the exact conditions for premature stop are:

"In sgx_init(), if misc_register() for the provision device fails, and
neither sgx_drv_init() nor sgx_vepc_init() succeeds, then ksgxd will be
prematurely stopped."

> This will leave pages in 'sgx_dirty_page_list' after
> __sgx_sanitize_pages() has completed, which results in a WARN_ON().
>
> The WARN_ON() is really only valid when __sgx_sanitize_pages() runs to
> completion *and* fails to empty 'sgx_dirty_page_list'.

This is correct.

> Is that it?

Just thinking if pr_warn() should be used if running to the completion
and failing to empty the list. A bit more information to the klog on
conditions, and not much extra complexity. What do you think?

> If so, could you please give the changelog another go?

BR, Jarkko