Re: [PATCH v10 05/16] x86/virt/tdx: Add skeleton to enable TDX on demand

From: Huang, Kai
Date: Mon Mar 13 2023 - 21:56:25 EST


On Mon, 2023-03-13 at 16:49 -0700, Isaku Yamahata wrote:
> On Sun, Mar 12, 2023 at 11:08:44PM +0000,
> "Huang, Kai" <kai.huang@xxxxxxxxx> wrote:
>
> > On Wed, 2023-03-08 at 14:27 -0800, Isaku Yamahata wrote:
> > > > +
> > > > +static int try_init_module_global(void)
> > > > +{
> > > > + int ret;
> > > > +
> > > > + /*
> > > > + * The TDX module global initialization only needs to be done
> > > > + * once on any cpu.
> > > > + */
> > > > + spin_lock(&tdx_global_init_lock);
> > > > +
> > > > + if (tdx_global_init_status & TDX_GLOBAL_INIT_DONE) {
> > > > + ret = tdx_global_init_status & TDX_GLOBAL_INIT_FAILED ?
> > > > + -EINVAL : 0;
> > > > + goto out;
> > > > + }
> > > > +
> > > > + /* All '0's are just unused parameters. */
> > > > + ret = seamcall(TDH_SYS_INIT, 0, 0, 0, 0, NULL, NULL);
> > > > +
> > > > + tdx_global_init_status = TDX_GLOBAL_INIT_DONE;
> > > > + if (ret)
> > > > + tdx_global_init_status |= TDX_GLOBAL_INIT_FAILED;
> > >
> > > If entropy is lacking (rdrand failure), TDH_SYS_INIT can return TDX_SYS_BUSY.
> > > In such case, we should allow the caller to retry or make this function retry
> > > instead of marking error stickily.
> >
> > The spec says:
> >
> > TDX_SYS_BUSY The operation was invoked when another TDX module
> > operation was in progress. The operation may be retried.
> >
> > So I don't see how entropy is lacking is related to this error. Perhaps you
> > were mixing up with KEY.CONFIG?
>
> TDH.SYS.INIT() initializes global canary value. TDX module is compiled with
> strong stack protector enabled by clang and canary value needs to be
> initialized. By default, the canary value is stored at
> %fsbase:<STACK_CANARY_OFFSET 0x28>
>
> Although this is a job for libc or language runtime, TDX modules has to do it
> itself because it's stand alone.
>
> From tdh_sys_init.c
> _STATIC_INLINE_ api_error_type tdx_init_stack_canary(void)
> {
> ia32_rflags_t rflags = {.raw = 0};
> uint64_t canary;
> if (!ia32_rdrand(&rflags, &canary))
> {
> return TDX_SYS_BUSY;
> }
> ...
> last_page_ptr->stack_canary.canary = canary;
>
>

Then it is a hidden behaviour of the TDX module that is not reflected in the
spec. I am not sure whether we should handle because: 

1) This is an extremely rare case. Kernel would be basically under attack if
such error happened. In the current series we don't handle such case in
KEY.CONFIG either but just leave a comment (see patch 13).

2) Not sure whether this will be changed in the future.

So I think we should keep as is.