Re: [PATCH] optee: don't fail on unsuccessful device enumeration
From: Volodymyr Babchuk
Date: Thu May 14 2020 - 21:02:06 EST
Hi Sumit,
On Thu, 14 May 2020 at 08:38, Sumit Garg <sumit.garg@xxxxxxxxxx> wrote:
>
> Hi Volodymyr,
>
> On Thu, 14 May 2020 at 06:48, Volodymyr Babchuk <vlad.babchuk@xxxxxxxxx> wrote:
> >
> > Hi Sumit,
> >
> > On Wed, 13 May 2020 at 11:24, Sumit Garg <sumit.garg@xxxxxxxxxx> wrote:
> > >
> > > Hi Volodymyr,
> > >
> > > On Wed, 13 May 2020 at 13:30, Jens Wiklander <jens.wiklander@xxxxxxxxxx> wrote:
> > > >
> > > > Hi Volodymyr,
> > > >
> > > > On Wed, May 13, 2020 at 2:36 AM Volodymyr Babchuk
> > > > <vlad.babchuk@xxxxxxxxx> wrote:
> > > > >
> > > > > > optee_enumerate_devices() can fail for multiple reasons. For
> > > > > > example, I encountered an issue where the Xen OP-TEE mediator NACKed
> > > > > > the PTA_CMD_GET_DEVICES call.
> > >
> > > Could you share a detailed description of the issue which you are
> > > facing? optee_enumerate_devices() is a simple invocation of pseudo TA
> > > and cases where OP-TEE doesn't provide corresponding pseudo TA are
> > > handled very well.
> >
> > Yes, I did some research and it looks like the issue is broader than I
> > expected. It is my fault that I wasn't paying attention to the TEE
> > client support in the kernel. Basically, it is incompatible with the
> > virtualization feature. You see, the main issue with virtual machines
> > is the second-stage MMU. Intermediate physical addresses that appear
> > contiguous to the kernel may not be contiguous in real physical
> > memory due to second-stage MMU mappings. This is the reason I
> > introduced OPTEE_MSG_ATTR_NONCONTIG in the kernel driver.
> >
> > But it looks like the kernel-side optee client does not use this
> > feature. It tries to provide the SHM buffer as a simple contiguous span
> > of memory. Xen blocks calls with OPTEE_MSG_ATTR_TYPE_TMEM_* but without
> > OPTEE_MSG_ATTR_NONCONTIG, because it can't translate IPAs to PAs for
> > such buffers. This is why the call to PTA_CMD_GET_DEVICES fails.
> >
> > A valid fix would be to use OPTEE_MSG_ATTR_NONCONTIG whenever possible.
> >
>
> Thanks for the detailed analysis. It looks like you are missing the
> following fix patch in your tree which basically fixed broken
> tee_shm_alloc() in case dynamic shared memory is enabled (IIRC
> virtualization only supports dynamic shared memory).
>
> commit a249dd200d03791cab23e47571f3e13d9c72af6c
Actually, I have this patch in my tree, so it does not fix the
issue. Which is weird, actually. I'm planning to look deeper into
this.
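
For reference, the Xen-side rule described above (TMEM parameters are rejected unless OPTEE_MSG_ATTR_NONCONTIG is set) can be sketched roughly as follows. This is only an illustration, not the actual mediator code: mediator_accepts_param() is a hypothetical helper, the constant values are written from memory of optee_msg.h and should be treated as assumptions, and only the TMEM rule is modelled (other parameter types are simply passed through here).

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as recalled from optee_msg.h; treat them as assumptions. */
#define OPTEE_MSG_ATTR_TYPE_TMEM_INPUT   0x9u
#define OPTEE_MSG_ATTR_TYPE_TMEM_OUTPUT  0xau
#define OPTEE_MSG_ATTR_TYPE_TMEM_INOUT   0xbu
#define OPTEE_MSG_ATTR_TYPE_MASK         0xffu
#define OPTEE_MSG_ATTR_NONCONTIG         (1u << 9)

/*
 * Hypothetical helper (not the real Xen code): decide whether a
 * mediator that cannot translate contiguous IPA ranges would accept
 * a parameter with the given attribute word.
 */
static bool mediator_accepts_param(uint64_t attr)
{
    uint64_t type = attr & OPTEE_MSG_ATTR_TYPE_MASK;
    bool is_tmem = type >= OPTEE_MSG_ATTR_TYPE_TMEM_INPUT &&
                   type <= OPTEE_MSG_ATTR_TYPE_TMEM_INOUT;

    /* Only the TMEM rule is modelled; everything else passes. */
    if (!is_tmem)
        return true;

    /* A TMEM buffer is acceptable only as a non-contiguous page list. */
    return (attr & OPTEE_MSG_ATTR_NONCONTIG) != 0;
}
```

With this rule, a plain OPTEE_MSG_ATTR_TYPE_TMEM_INOUT parameter is refused, while the same type ORed with OPTEE_MSG_ATTR_NONCONTIG is accepted.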
>
> > >
> > > > > This should not result in driver
> > > > > initialization error because this is an optional feature.
> > >
> > > I wouldn't call it an optional feature, as there might be real kernel
> > > drivers dependent on this enumeration. Also, it is a simple way to
> > > self-test OP-TEE functionality. So I am not sure how functional
> > > OP-TEE would be if this basic TA invocation fails.
> >
> > Well, it fixed the case when Xen is involved. I think this is a valid
> > combination: a platform with the newest OP-TEE but a slightly older
> > kernel. So, imagine that OP-TEE provides PTA_CMD_GET_DEVICES, but the
> > kernel can't use it because it uses plain TMEM arguments, which are not
> > supported in a virtualized environment.
> >
> > If there are kernel drivers that depend on this PTA, they would not
> > work in any case. But at least userspace clients would still be able to
> > use OP-TEE. This is why I call this feature "optional".
>
> As you can see above, tee_shm_alloc() being broken in your case was
> detected via this simple pseudo TA invocation. So IMO, it would be
> better to keep the existing behaviour as it provides a kind of basic
> OP-TEE driver runtime self test too. Also, I think it would be a
> better user experience to have every OP-TEE interface working rather
> than a partially broken interface.
I can see your point. But I think it is good not to break
backward and forward compatibility. Imagine that a user upgrades
OP-TEE without changing the kernel. Previously it worked well, but the
new OP-TEE provides a new PTA and the kernel refuses to load the optee
driver because the driver fails to initialize that PTA.

This is basically what happened to me. The platform that I am using
does not provide any OP-TEE devices, so I assumed that I could safely
ignore this feature. But when I flashed the latest OP-TEE build, I got
a dead optee driver. This is confusing from a user standpoint. You don't
expect a firmware upgrade to another minor version to break an
existing setup. My proposed patch at least prints a warning, so the user
would know where to look...
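
To illustrate the behaviour I mean, here is a minimal userspace sketch of the "warn and continue" pattern. It is not the patch itself: probe_sketch(), enumerate_fn and the two stub enumerators are hypothetical names made up for this example, and fprintf() stands in for the kernel's warning print.

```c
#include <stdio.h>

/*
 * Hypothetical stand-in for optee_enumerate_devices(): returns 0 on
 * success or a negative errno-style code on failure.
 */
typedef int (*enumerate_fn)(void);

/*
 * Sketch of a probe path that treats enumeration as best effort:
 * a failure is reported, but the driver still comes up.
 */
static int probe_sketch(enumerate_fn enumerate)
{
    int rc = enumerate();

    if (rc)
        fprintf(stderr,
                "optee: device enumeration failed (%d), continuing\n", rc);

    /* Enumeration failure is deliberately not propagated. */
    return 0;
}

/* Stub that models a NACKed PTA_CMD_GET_DEVICES call. */
static int enumerate_fails(void)
{
    return -22;
}

/* Stub that models successful enumeration. */
static int enumerate_ok(void)
{
    return 0;
}
```

In this sketch probe_sketch(enumerate_fails) still returns 0 and only emits a warning, which is the trade-off under discussion: userspace clients keep working, while drivers that depend on enumerated devices do not.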
Anyway, if we find a proper fix before the next code freeze, I'd
prefer to drop this particular patch. But let's keep it as plan
B. What do you think?
--
WBR Volodymyr Babchuk aka lorc [+380976646013]
mailto: vlad.babchuk@xxxxxxxxx