Re: [PATCH] optee: don't fail on unsuccessful device enumeration
From: Sumit Garg
Date: Fri May 15 2020 - 00:55:10 EST
Hi Volodymyr,
On Fri, 15 May 2020 at 06:32, Volodymyr Babchuk <vlad.babchuk@xxxxxxxxx> wrote:
>
> Hi Sumit,
>
> On Thu, 14 May 2020 at 08:38, Sumit Garg <sumit.garg@xxxxxxxxxx> wrote:
> >
> > Hi Volodymyr,
> >
> > On Thu, 14 May 2020 at 06:48, Volodymyr Babchuk <vlad.babchuk@xxxxxxxxx> wrote:
> > >
> > > Hi Sumit,
> > >
> > > On Wed, 13 May 2020 at 11:24, Sumit Garg <sumit.garg@xxxxxxxxxx> wrote:
> > > >
> > > > Hi Volodymyr,
> > > >
> > > > On Wed, 13 May 2020 at 13:30, Jens Wiklander <jens.wiklander@xxxxxxxxxx> wrote:
> > > > >
> > > > > Hi Volodymyr,
> > > > >
> > > > > On Wed, May 13, 2020 at 2:36 AM Volodymyr Babchuk
> > > > > <vlad.babchuk@xxxxxxxxx> wrote:
> > > > > >
> > > > > > > optee_enumerate_devices() can fail for multiple reasons. For
> > > > > > > example, I encountered an issue when the Xen OP-TEE mediator NACKed
> > > > > > > the PTA_CMD_GET_DEVICES call.
> > > >
> > > > Could you share a detailed description of the issue which you are
> > > > facing? optee_enumerate_devices() is a simple invocation of pseudo TA
> > > > and cases where OP-TEE doesn't provide corresponding pseudo TA are
> > > > handled very well.
> > >
> > > > Yes, I did some research and it looks like the issue is broader than I
> > > > expected. It is my fault that I wasn't paying attention to the tee
> > > > client support in the kernel. Basically, it is incompatible with the
> > > > virtualization feature. You see, the main issue with virtual machines
> > > > is the second stage MMU. Intermediate physical addresses that appear to
> > > > be contiguous to the kernel may not be contiguous in real
> > > > physical memory due to 2nd stage MMU mappings. This is the reason I
> > > > introduced OPTEE_MSG_ATTR_NONCONTIG in the kernel driver.
> > >
> > > > But it looks like the kernel-side optee client does not use this feature. It
> > > > tries to provide the SHM buffer as a simple contiguous span of memory. Xen
> > > > blocks calls with OPTEE_MSG_ATTR_TYPE_TMEM_* but without
> > > > OPTEE_MSG_ATTR_NONCONTIG, because it can't translate IPAs to PAs for
> > > > such buffers. This is why the call to PTA_CMD_GET_DEVICES fails.
> > > >
> > > > A valid fix would be to use OPTEE_MSG_ATTR_NONCONTIG whenever possible.
> > >
> >
> > Thanks for the detailed analysis. It looks like you are missing the
> > following fix patch in your tree which basically fixed broken
> > tee_shm_alloc() in case dynamic shared memory is enabled (IIRC
> > virtualization only supports dynamic shared memory).
> >
> > commit a249dd200d03791cab23e47571f3e13d9c72af6c
>
> Actually, I have this patch in my tree. So, it does not fix the
> issue. Which is weird, actually. I'm planning to look deeper into
> this.
AFAICT, the only difference here is that kernel memory is being
registered rather than user-space memory. But I am not very conversant
with the Xen environment, so I hope you will be able to find the root
cause of why Xen is blocking this invocation.
>
> >
> > > >
> > > > > > This should not result in driver
> > > > > > initialization error because this is an optional feature.
> > > >
> > > > I wouldn't call it an optional feature as there might be real kernel
> > > > drivers dependent on this enumeration. Also, it is a simple example to
> > > > > self test OP-TEE functionality too. So I am not sure how
> > > > > functional OP-TEE would be if this basic TA invocation fails.
> > >
> > > > Well, it fixed the case when Xen is involved. I think this is a valid
> > > > combination, when the platform has the newest OP-TEE but a slightly older
> > > > kernel. So, imagine that OP-TEE provides PTA_CMD_GET_DEVICES, but the
> > > > kernel can't use it because it uses plain TMEM arguments, which are not
> > > > supported in a virtualized environment.
> > > >
> > > > If there are kernel drivers that depend on this PTA, they would not
> > > > work in any case. But at least userspace clients would still be able to use
> > > > OP-TEE. This is why I call this feature "optional".
> >
> > As you can see above, tee_shm_alloc() being broken in your case was
> > detected via this simple pseudo TA invocation. So IMO, it would be
> > better to keep the existing behaviour as it provides a kind of basic
> > OP-TEE driver runtime self test too. Also, I think it would be a
> > better user experience to have every OP-TEE interface working rather
> > than a partially broken interface.
>
> I can see your point. But I think that it is good not to break
> backward and forward compatibility. Imagine that a user upgrades
> OP-TEE without changing the kernel. Previously it worked well, but the new
> OP-TEE provides a new PTA and the kernel refuses to load the optee driver
> because the driver fails to initialize that PTA.
>
> This is basically what happened to me. The platform that I am using does
> not provide any OP-TEE devices, so I assumed that I could safely ignore
> this feature. But when I flashed the latest OP-TEE build I got a dead
> optee driver. This is confusing from a user standpoint. You don't
> expect that a firmware upgrade to another minor version will break
> an existing setup. My proposed patch at least prints the warning, so the user
> would know where to look...
Warning prints aren't very useful, since the current OP-TEE CI can't
detect them.
>
> Anyways, if we find a proper fix before the next code freeze, I'd
> prefer to drop this particular patch. But let's keep it as a plan
> B. What do you think?
It seems that the kernel internal interface is currently broken with
virtualization support. So how about "plan B" being to skip the
enumeration when "OPTEE_SMC_SEC_CAP_VIRTUALIZATION" is set? We can't
expect TEE kernel drivers to work until this interface is fixed. An
informational message saying that the kernel internal interface is not
supported with virtualization would be useful too.
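
To make the "plan B" idea concrete, here is a rough, standalone sketch of
the control flow only. It is not a patch against the driver: the names
prefixed with sketch_/SKETCH_ are hypothetical stand-ins so that it
compiles outside the kernel tree, the capability bit value is an
assumption for illustration, and printf() stands in for whatever
informational print the driver would use.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Stand-in for OPTEE_SMC_SEC_CAP_VIRTUALIZATION. The bit position is an
 * assumption for this sketch, not copied from the driver headers.
 */
#define SKETCH_SEC_CAP_VIRTUALIZATION (1u << 3)

/* Hypothetical stubs standing in for optee_enumerate_devices(). */
static int sketch_enumerate_ok(void)   { return 0; }
static int sketch_enumerate_fail(void) { return -5; }

/*
 * Skip device enumeration when the secure world reports virtualization
 * support; otherwise run it and propagate its result, so a failure still
 * fails driver initialization as it does today.
 */
static int sketch_probe_enumerate(uint32_t sec_caps, int (*enumerate)(void))
{
	if (sec_caps & SKETCH_SEC_CAP_VIRTUALIZATION) {
		/* Informational message only; initialization continues. */
		printf("optee: kernel internal interface not supported with virtualization, skipping device enumeration\n");
		return 0;
	}

	return enumerate();
}
```

With this shape, the non-virtualized case keeps the existing behaviour
(enumeration failure is still fatal and visible to CI), while the
virtualized case only loses the kernel internal interface.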
-Sumit
>
> --
> WBR Volodymyr Babchuk aka lorc [+380976646013]
> mailto: vlad.babchuk@xxxxxxxxx