Re: [PATCH v2 2/6] x86/sgx: Do not consider unsanitized pages an error

From: Huang, Kai
Date: Thu Sep 01 2022 - 06:50:41 EST


On Wed, 2022-08-31 at 13:39 -0700, Reinette Chatre wrote:
> >   static int ksgxd(void *p)
> >   {
> > + long ret;
> > +
> >    set_freezable();
> >  
> >    /*
> >    * Sanitize pages in order to recover from kexec(). The 2nd pass is
> >    * required for SECS pages, whose child pages blocked EREMOVE.
> >    */
> > - __sgx_sanitize_pages(&sgx_dirty_page_list);
> > - __sgx_sanitize_pages(&sgx_dirty_page_list);
> > + ret = __sgx_sanitize_pages(&sgx_dirty_page_list);
> > + if (ret == -ECANCELED)
> > + /* kthread stopped */
> > + return 0;
> >  
> > - /* sanity check: */
> > - WARN_ON(!list_empty(&sgx_dirty_page_list));
> > + ret = __sgx_sanitize_pages(&sgx_dirty_page_list);
> > + switch (ret) {
> > + case 0:
> > + /* success, no unsanitized pages */
> > + break;
> > +
> > + case -ECANCELED:
> > + /* kthread stopped */
> > + return 0;
> > +
> > + default:
> > + /*
> > + * Never expected to happen in a working driver. If it
> > happens
> > + * the bug is expected to be in the sanitization process,
> > but
> > + * successfully sanitized pages are still valid and driver
> > can
> > + * be used and most importantly debugged without issues. To
> > put
> > + * short, the global state of kernel is not corrupted so no
> > + * reason to do any more complicated rollback.
> > + */
> > + pr_err("%ld unsanitized pages\n", ret);
> > + }
> >  
> >    while (!kthread_should_stop()) {
> >    if (try_to_freeze())
>
>
> I think I am missing something here. A lot of logic is added here but I
> do not see why it is necessary.  ksgxd() knows via kthread_should_stop() if
> the reclaimer was canceled. I am thus wondering, could the above not be
> simplified to something similar to V1:
>
> @@ -388,6 +393,8 @@ void sgx_reclaim_direct(void)
>  
>  static int ksgxd(void *p)
>  {
> + unsigned long left_dirty;
> +
>   set_freezable();
>  
>   /*
> @@ -395,10 +402,10 @@ static int ksgxd(void *p)
>   * required for SECS pages, whose child pages blocked EREMOVE.
>   */
>   __sgx_sanitize_pages(&sgx_dirty_page_list);
> - __sgx_sanitize_pages(&sgx_dirty_page_list);
>  
> - /* sanity check: */
> - WARN_ON(!list_empty(&sgx_dirty_page_list));
> + left_dirty = __sgx_sanitize_pages(&sgx_dirty_page_list);
> + if (left_dirty && !kthread_should_stop())
> + pr_err("%lu unsanitized pages\n", left_dirty);
>  

This basically means driver bug if I understand correctly. To be consistent
with the behaviour of existing code, how about just WARN()?

...
left_dirty = __sgx_sanitize_pages(&sgx_dirty_page_list);
WARN_ON(left_dirty && !kthread_should_stop());

It seems there's little value to print out the unsanitized pages here. The
existing code doesn't print it anyway.

--
Thanks,
-Kai